Brief Reports


The Journal of Consulting Psychology will 
accept Brief Reports of research studies in 
clinical psychology for early publication with- 
out expense to the author. The procedure is 
intended to permit the publication of soundly 
designed studies of specialized interest or lim- 
ited importance which cannot now be ac- 
cepted because of lack of space. Several pages 
in each issue will be devoted to Brief Reports, 
published in the order of their receipt with- 
out respect to the dates of receipt of the regu- 
lar articles. Most Brief Reports appear in the 
first or second issue to go to press following 
their final acceptance. 


An author who wishes to submit a Brief 
Report: 


1. Sends the Brief Report, limited to one printed
page and prepared according to the specifications
given below.


2. Also sends to the Editor a full report of the re- 
search study, in sufficient detail to give a clear ac- 
count of its background, procedure, results, and con- 
clusions, which will be filed with the American 
Documentation Institute to insure indefinite avail- 
ability. 


3. Prepares at least 100 mimeographed copies of
the full report, which the author will send without
charge to all who request it as long as the supply
lasts.


4. Agrees not to submit the full report to another 
journal of general circulation. 


Specifications 

Brief Report. The Brief Report should give 
a clear, condensed summary of the procedure 
of the study and as full an account of the re- 
sults as space permits. 

To insure that the Brief Report will be no 
longer than one printed page, its typescript, 
including all matter except the title and the 
author’s lines, must not exceed 75 lines av-
eraging 42 characters and spaces in length.
Set the typewriter margins for short lines of 
42 characters, which are 3.5 inches long in 
elite typing, and 4.2 inches long in pica. 

The manuscript of the Brief Report must 
be double spaced throughout. Except for its 
short lines, it follows the standard style (1). 
Headings, tables, and references are avoided 
or, if essential, must be counted in the 75 
lines. Each Brief Report must be accom- 
panied by a footnote in the style below, 
which is typed on a separate sheet and not 
counted in the 75-line quota:¹


1 An extended report of this study may be ob-
tained without charge from John Doe, 300 Market
St., Prospect 6, Mass. (giving the author’s full name
and address), or for a fee from the American Docu-
mentation Institute. Order Document No. ____, re-
mitting $____ for microfilm or $____ for photo-
copies.


Extended report. Because the extended re- 
port is intended for photoduplication, and is 
not copy to be sent to a printer, its style 
should differ in several ways from that of 
other manuscripts: (a) The extended report 
should be typed with single spacing for 
economy in duplication. (b) Tables and fig- 
ures should be placed adjacent to the text 
which refers to them. A caption should be 
typed below each figure. (c) Footnotes should 
be typed at the bottom of the page on which 
reference is made to them. In other respects, 
the full report is prepared in the style speci- 
fied by the Publication Manual (1). 


Reference 


1. American Psychological Association. Council of 
Editors. Publication manual of the American 
Psychological Association (1957 rev.). Wash- 
ington, D. C.: American Psychological Asso- 
ciation, 1957. 
The Prediction of Length of Stay in Psychotherapy¹


Maurice Lorr, Martin M. Katz, and Eli A. Rubinstein 


U. S. Veterans Administration 


Much needed by mental hygiene clinics is 
a method of early identification of patients 
who for various reasons terminate treatment 
after only a few visits and without the ad- 
vice or consent of the therapist. Various sur- 
veys have shown that anywhere from 30% to 
60% of patients referred to a mental hygiene 
clinic and accepted for psychotherapy will 
terminate within the first six treatment visits 
(4, 7). In a substantial number of cases ter- 
mination occurs even before the initial inter- 
view with the therapist. Early identification 
of premature terminators would permit the 
clinic to adapt its intake procedure in ap- 
propriate ways. Special procedures designed 
to motivate or orient potential terminators 
might be instituted. Or, other treatment mo- 
dalities such as tranquilizer drugs might be 
utilized or devised. 

In an earlier study, four short psychologi- 
cal tests and questionnaires were found to be 
predictive of length of stay in treatment (9). 
Patients from 10 Veterans Administration out- 
patient clinics throughout the country who 
remained in psychotherapy for at least six 
months were compared with those who ter- 
minated within a month. The patients con- 
sisted of nonpsychotic males in intensive in- 
dividual therapy who had not been seen in 
treatment previously. The items were selected 
on the basis of a double cross-validation on 
two random halves of a sample of 128 cases. 
Results indicated that remainers may be 
characterized as less nomadic, less impulsive, 
less rigid in personal attitudes, and more self- 
dissatisfied than terminators. In addition, re- 
mainers tend to have more education, perhaps 
because of greater goal-directed persistence,
and they tend to come from higher socioeco-
nomic levels than terminators.


1 From Veterans Administration, Veterans Benefits
Office, Washington, D. C.

One aim of the present study was to test 
the validity of the predictive test battery on 
a wider range of clinic cases. Another objec- 
tive was to test a series of research hypothe- 
ses concerning the characteristics that differ- 
entiate patients who terminate prematurely 
from patients who remain in treatment. These 
hypotheses grew out of a survey of the litera- 
ture and out of the above mentioned research 
results of the initial study (2, 6, 12). 

The specific hypotheses, in each case from 
the standpoint of Terminator characteristics,
are as follows: 

1. Terminators are more likely to have a 
history of frequent trouble with the law, lack 
of impulse control, hostility to authority, lack 
of goal persistence, and lack of personal ties 
or loyalties. 

2. Terminators are less self-dissatisfied. 

3. Terminators are less likely to report 
anxiety. 

4. Terminators have more limited vocabu- 
laries. 

5. Terminators are more authoritarian. 

6. Terminators’ socioeconomic level is lower. 


Method 


A nation-wide group of 13 clinics agreed to 
collect data over a period of three months on 
all cases considered for psychotherapy (group 
or individual).² Patients examined for hos-
pitalization, neurological disorders, or for brief
(half hour or less) irregular treatment were 
not included. 


2 We are indebted to the VA Mental Hygiene
Clinics and respective clinic staffs in the following 
cities for collaborating in the collection of the basic 
data: Baltimore, Brooklyn, Chicago, Cleveland, Den- 
ver, Miami, Newark, Pittsburgh, St. Louis, San An- 
tonio, San Francisco, Seattle, and Washington, D. C. 
The intake interviewer, usually a social
worker, completed a form at the close of the 
interview describing such patient character- 
istics as his occupation, annual earnings, and 
presenting problems. In addition, the inter- 
viewer estimated how long the patient would 
remain in treatment if accepted. In this way 
the test battery’s success in predicting length 
of stay could be compared with the judgment 
of an interviewer unaided by test information. 
The therapist filled out a data sheet when a 
patient discontinued treatment prior to the 
26th visit for any reason, or at the end of six 
months of treatment if the patient was still in 
treatment at that time. The therapist also in- 
dicated the reason for termination if this oc- 
curred, the frequency and duration of visits, 
and the type of treatment given. 

The measures used, given below, were self- 
administerable and required 45 minutes or 
less for completion. 


1. A 39-item Behavior Disturbance scale taken 
from a longer unpublished inventory devised by 
Applezweig and Dibner. The true-false items include 
elements primarily of a biographical nature designed 
to elicit information concerning the extent of certain 
behavior in the patient’s past. The items ask ques- 
tions concerning lack of personal ties or loyalties, 
lack of impulse control, restlessness, frequent trouble 
with the law, lack of ethical standards, hostility to 
authority, and lack of goal persistence. 

2. A Self-Rating scale consisting of 18 five-point 
graphic rating scales. First the patient rates his 
actual self and then he rates himself the way he 
would like to be on the same scales. The difference 
between the self-rating and the ideal-rating pro- 
vides a measure of self-dissatisfaction. 

3. A 30-item version of the Taylor Manifest Anx- 
iety Scale which purports to measure manifest anx- 
iety (10). 

4. A 15-item multiple choice vocabulary test, which 
represents a modification of the Thorndike selection 
of words in the Stanford-Binet Vocabulary Test. 

5. A 20-item F scale taken from Adorno, Frenkel- 
Brunswik, et al. (1). The scale purports to be a 
measure of authoritarianism and conventionalism as 
defined by the authors. The patient indicates his de- 
gree of agreement with each statement on a four- 
point scale. 
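The self-dissatisfaction measure from the Self-Rating scale (measure 2 above) can be illustrated with a minimal sketch. The source says only that the difference between the self-rating and the ideal-rating is used; summing absolute differences across the scales is an assumption made here for illustration, and the function name and example ratings are hypothetical.

```python
def self_dissatisfaction(actual, ideal):
    """Sum of absolute gaps between actual-self and ideal-self ratings.

    Assumption: the discrepancy is taken scale by scale as an absolute
    difference and then summed; the source does not state the exact rule.
    """
    if len(actual) != len(ideal):
        raise ValueError("actual and ideal ratings must cover the same scales")
    return sum(abs(a - i) for a, i in zip(actual, ideal))


# Hypothetical ratings on three five-point scales (the study used 18).
example_actual = [2, 3, 5]
example_ideal = [4, 3, 4]
example_score = self_dissatisfaction(example_actual, example_ideal)
```

A larger total indicates a wider gap between how the patient sees himself and how he would like to be.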


In all, approximately 300 usable cases were 
received from the 13 clinics. Cases were ex- 
cluded from the study if (a) the patient ter- 
minated prior to six full months from the date 
of initiation of treatment with mutual consent 
of the therapist and was rated improved; (b)
the interval between the initial intake inter-
view and the initial treatment was over 10
weeks; (c) the patient did not receive an av- 
erage of at least one treatment every four 
weeks during his stay in treatment; (d) treat- 
ment was terminated for such reasons as 
transfer, hospitalization, ineligibility, or re- 
location to another city; and (e) the patient 
was female. The first four qualifications made 
the group a more clear-cut sample of termina- 
tors and remainers. The sample was confined 
to male patients because so few female veter- 
ans are seen in VA clinics. 

The criterion used in this study was num- 
ber of weeks in treatment. The number of 
treatments received was considered as a pos- 
sible criterion but rejected since the number 
of treatments scheduled varies from patient to 
patient. Furthermore, the number of treat- 
ments is likely to be a less reliable figure in 
most clinics than sheer length. However, the 
correlation of .60 between the two variables 
indicated that the measures have much in 
common. The entire initial sample of 291 
cases was randomly split into two samples: 
A and B. Patients were so assigned that each 
clinic and each therapist had approximately 
equivalent representation in Subsamples A 
and B. The distribution by weeks of treat- 
ment was also equated in the two samples. 

The criterion distribution was found to be 
U-shaped; only 60 cases were in treatment for 
more than 6 weeks but less than 26 weeks. 
It was also evident, as in the original study, 
that the middle group could not be differ- 
entiated by the test battery from the two end 
groups. Accordingly patients with 6 weeks of
treatment or less were designated Terminators, 
and patients with 26 weeks or more of treat- 
ment were designated Remainers. The middle 
group of patients staying from 7 to 25 weeks
was not used in the present analysis. Each 
subsample consisted of 115 cases, with 57 
Terminators and 58 Remainers in each sample. 

To test the research hypotheses, and to 
identify the items in Sample A significantly 
related to the dichotomous criterion, point 
biserial correlations and phi coefficients were 
computed and tested for significance.³ The
items within a given inventory or test which
correlated with the Terminator-Remainer cri-
terion at the .05 level of significance or better
were combined with unit weights into a sub-
test. The answer sheets were then rescored for
each subtest, and the subtests were correlated
with the criterion. The Lubin-Summerfield
square root procedure (8) was applied to the
six variables most highly correlated with the
criterion. This procedure selects from a group
of variables the minimum set of effective vari-
ables which will result in the highest possible
multiple correlation coefficient with the cri-
terion.


3 The statistical assistance of Elizabeth Turk is
gratefully acknowledged.

A configural approach was next applied to 
the minimum set of effective variables selected 
by the square root procedure. The configural 
approach was utilized because it takes into 
account the interaction among variables, and 
any nonlinear relationships which are not in- 
cluded in a linear multiple regression equation. 
Each of the four subscores selected was di- 
chotomized at the mean and each patient in 
Sample A was allocated either to the upper 
or lower half of each variable. In each case, a 
high score represented a Remainer score, while 
a low score typified the Terminator. The com- 
bination of Upper and Lower categories on the 
four variables yields 16 patterns. Each patient 
was classified into one of the 16 patterns. The
percentage of Terminators and Remainers fall- 
ing into each pattern was then determined. 
Finally the percentages were transformed into 
a series of “configural” scores ranging from 0 
to 9 by the following equation (11): 


$$S = 9 \times \frac{(\%\ \text{Remainers in given pattern}) - (\text{Lowest}\ \%\ \text{Remainers in any pattern})}{(\text{Highest}\ \%\ \text{Remainers in any pattern}) - (\text{Lowest}\ \%\ \text{Remainers in any pattern})}$$

These configural scores were then correlated 
with the dichotomous criterion. 
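As an illustration of the transformation just described, the following sketch rescales each pattern's percentage of Remainers so that the lowest pattern scores 0 and the highest scores 9. The function name and the example percentages are hypothetical, not data from the study.

```python
def configural_scores(pct_remainers):
    """Rescale per-pattern Remainer percentages onto a 0-to-9 range."""
    lowest = min(pct_remainers.values())
    highest = max(pct_remainers.values())
    if highest == lowest:
        raise ValueError("all patterns share one percentage; scale undefined")
    return {
        pattern: 9 * (pct - lowest) / (highest - lowest)
        for pattern, pct in pct_remainers.items()
    }


# Hypothetical percentages for three of the 16 patterns.
example_pcts = {"LLLL": 10.0, "LULU": 50.0, "UUUU": 90.0}
example_scores = configural_scores(example_pcts)
```

With these made-up inputs the middle pattern lands halfway along the scale, which shows that the score is a linear rescaling of the percentage.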

The beta weights derived from Sample A 
were then used to score the 115 cases in 
Sample B plus an additional 16 cases received 
late from one clinic. Similarly, the 16 patterns 
derived from Sample A were used to score the 
cases in Sample B. Point-biserial correlations 
were then computed with length of stay to 
determine the degree of cross-validity obtain- 
ing for the four subtests selected. 

To complete the double cross-validation, the 
items for Sample B were independently cor- 
related with the criterion as had been done for 
Sample A. An item was selected for the final
set if the item-criterion correlation was sig- 
nificant at the .05 level or better in both sam- 
ples (a one-tailed test). 
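The item-selection rule of the double cross-validation can be sketched as follows. This is a simplified illustration: the point-biserial coefficient is computed as an ordinary Pearson correlation with a 0/1 criterion, and `critical_r` is a hypothetical stand-in for the tabled value of r at the .05 level, which the caller must supply. All names and data are illustrative.

```python
from math import sqrt


def point_biserial(item, criterion):
    """Pearson r between item scores and a 0/1 criterion."""
    n = len(item)
    mean_x = sum(item) / n
    mean_y = sum(criterion) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(item, criterion))
    sxx = sum((x - mean_x) ** 2 for x in item)
    syy = sum((y - mean_y) ** 2 for y in criterion)
    if sxx == 0 or syy == 0:
        return 0.0  # no variance, so no usable correlation
    return sxy / sqrt(sxx * syy)


def select_items(sample_a, sample_b, crit_a, crit_b, critical_r):
    """Keep items that reach critical_r in both halves with the same sign."""
    selected = []
    for name in sample_a:
        r_a = point_biserial(sample_a[name], crit_a)
        r_b = point_biserial(sample_b[name], crit_b)
        if abs(r_a) >= critical_r and abs(r_b) >= critical_r and r_a * r_b > 0:
            selected.append(name)
    return selected


# Tiny made-up data set: 1 = Remainer, 0 = Terminator.
crit_a = [0, 0, 1, 1]
crit_b = [0, 0, 1, 1]
items_a = {"item1": [0, 0, 1, 1], "item2": [1, 0, 1, 0]}
items_b = {"item1": [0, 1, 1, 1], "item2": [0, 1, 0, 1]}
kept = select_items(items_a, items_b, crit_a, crit_b, critical_r=0.5)
```

Requiring agreement in both halves guards against items that correlate with the criterion in one sample by chance alone.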


The Findings 


The first five research hypotheses were 
tested through a comparison of Terminator 
and Remainer scores on the a priori keys for 
the five tests. A result was regarded as sig- 
nificant if the point-biserial correlation was 
significant at the .05 level or better on both 
Samples A and B. None of the a priori keys 
proved to be differentiating in both samples 
although all scores were consistently in the 
predicted direction. 

Differences found between Terminators and 
Remainers in annual earnings, occupational 
level, and highest grade completed were also 
in the predicted direction. However, the differ- 
ences were not at the required significance 
levels in both samples. An examination of the 
data by clinic suggested that there might be 
substantial clinic differences. Consequently, 
data on annual earnings from five of the 
clinics with the largest samples were subjected 
to a two-way analysis of variance. No sig- 
nificant clinic differences and no significant 
interactions between clinics and the criterion 
were found. However, an F test significant at
the .001 level and a point biserial of .28 indi-
cated that annual earnings were indeed pre-
dictive of termination. In contrast, an analysis of
occupational level and grade level by some- 
what different procedures failed to reveal sig- 
nificant differences between Terminators and 
Remainers for the same five clinics. 

The item analysis of Sample A identified six 
differentiating items in the F scale, eight in 
the Behavior Disturbance scale, eight items in 
the Manifest Anxiety scale, and five items in 
the Self-Rating scale. Each set of items was 
combined into a subscale. The correlations 
among the four subscales, the Vocabulary 
Test, the Social Worker’s Prediction, and the 
criterion were subjected to the square root 
analysis previously described. The criterion 
correlations of the six variables analyzed by 
this procedure are presented in Table 1. A 
multiple correlation of .67 significant at the
.01 level was obtained between the F subscale,
the Behavior Disturbance subscale, the Mani-
fest Anxiety subscale, the Social Worker’s Pre-
diction, and the criterion. Application of Sam-
ple A beta weights to Sample B scores resulted
in a combined test-score-criterion correlation
of .39, which is also significant at the .01 level.


Table 1


Correlations of Subscales and Social Worker’s
Prediction with Terminator-Remainer
Criterion in Samples A and B


                                       Correlation
Variable                           Sample A      Sample B

F scale                            .32 [?]       .07
Manifest Anxiety scale             .30 [?]       [illegible]
Behavior Disturbance scale         [illegible]   [illegible]
Self-Rating scale discrepancy      [illegible]   .10
Vocabulary test                    .12           .21*
Social Worker Prediction           .36*          .36*


** Significant at .01 level.
* Significant at .05 level.
[?] Significance marker illegible.

The configural approach to the minimum 
set of four variables yielded a correlation of 
.62 for Sample A. Next, Sample B was scored 
on Sample A patterns. The correlation of Sam- 
ple B patients’ configural scores with the cri- 
terion yielded a correlation coefficient of .43. 
The configural approach thus results in as ef- 
fective a separation of Terminators and Re- 
mainers on cross-validation as multiple regres- 
sion. 

To complete the cross-validation, the indi- 
vidual items significantly correlated with the 
criterion in Sample A were checked on Sample 
B and those significant in Sample B were 
checked on A. By this procedure, 22 items 
proved to be significantly differentiating at the 
.05 level or better (one-tailed test) and in the 
same direction in both samples. The entire 
sample of 230 cases was then rescored on the 
significant 21 items allocated to three sub-
tests (Anxiety scale, F scale, Behavior Dis-
turbance scale). Configural scores based on
these three subtests and the social worker’s 
prediction correlated .50 with the criterion. 

The actuarial items were also correlated 
with the criterion in both Samples A and B. 
Only race and religion proved to be consistent 
in predicting termination of treatment. Negro 
patients were more likely to be terminators 
than other patients (correlations of .25 and .18).
Jewish patients compared to all others were
more likely to be remainers (correlations of .17 and
.17). Tests were also made for type of psy-
chiatric disorder, current psychiatric disability
rating, number of previous courses of treat- 
ment, a listing of ten major problem areas 
(e.g., problems with authority figures, voca- 
tional adjustment, marital conflict), and mari- 
tal status. None were found to relate to the 
criterion at significant levels in both samples. 
Scheduled treatment frequency, length of treat- 
ment interview, type of treatment (group, in- 
dividual, or both) showed no significant rela- 
tion to the criterion. Likewise the analysis 
indicated that the sex, discipline, and length 
of experience of the therapist were unrelated 
to the criterion. 

The possibility remained that a position 
response set was operative in several of the 
original scales and was in part responsible for 
the scale-criterion correlations. The index of 
response set was defined as $R_T/N_T - R_F/N_F$,
where $R_T$ is the number of items keyed True
and marked True by the examinee, and $R_F$ is
the number of items keyed False and marked
False by the examinee. $N_T$ and $N_F$ are the
numbers of items keyed True and keyed
False respectively. Correlations of response-set
scores with the criterion were computed sepa- 
rately for the Manifest Anxiety and Behavior 
Disturbance scales. The correlations obtained 
were .08 and —.05 respectively. Evidently, 
then, response sets are not responsible for 
these scale-criterion correlations. However, the 
question remained whether the response set 
was acting as a suppressor variable. To check 
this possibility, content and set scores were 
correlated in each scale. Neither coefficient
(-.14 and .03) was significant. Thus response
set as here defined exerted no influence on the 
scale-criterion correlations. 
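The response-set index defined above (the proportion of True-keyed items marked True minus the proportion of False-keyed items marked False) can be sketched in a few lines. The function name and the four-item key are hypothetical.

```python
def response_set_index(key, answers):
    """Return R_T/N_T - R_F/N_F for one examinee.

    key and answers are equal-length lists of booleans,
    where True stands for a "True" keying or response.
    """
    if len(key) != len(answers):
        raise ValueError("key and answers must have the same length")
    n_true = sum(1 for k in key if k)
    n_false = len(key) - n_true
    if n_true == 0 or n_false == 0:
        raise ValueError("the scale needs both True-keyed and False-keyed items")
    r_true = sum(1 for k, a in zip(key, answers) if k and a)
    r_false = sum(1 for k, a in zip(key, answers) if not k and not a)
    return r_true / n_true - r_false / n_false


# Hypothetical four-item scale: two items keyed True, two keyed False.
example_key = [True, True, False, False]
always_true = response_set_index(example_key, [True, True, True, True])
always_false = response_set_index(example_key, [False, False, False, False])
matches_key = response_set_index(example_key, example_key)
```

An examinee who marks every item True scores at the positive extreme, one who marks every item False scores at the negative extreme, and one whose answers follow the key exactly scores zero, so the index isolates the tendency to favor one response position independently of content.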


Discussion 


As we indicated earlier, all of the scores 
based on the a priori keys were in the hy- 
pothesized direction on both samples but not 
at acceptable levels of significance. Of the 89 
items in the Behavior Disturbance scale, the 
F scale, and the Anxiety scale, 22 items were 
significantly correlated with the criterion at 
the .05 level or better in both samples. Only 
one of these significant items was keyed in a 
direction not hypothesized. Consequently no
other items are inconsistent with the hy- 
potheses. Do the remaining 21 items tend to 
be consistent with and to support the research 
hypotheses? 

The Behavior Disturbance items will be 
considered first. The research hypothesis states 
that Terminators are more likely than Re- 
mainers to report a history of antisocial be- 
havior, to lack impulse control, and to lack 
loyalties or personal ties. The following re- 
sponses seem to support these conjectures: 


I have been arrested three times or more. (True) 

I have never been in a reform school, prison, work- 
house, or jail. (False) 

“Every man for himself” is the wisest rule to fol- 
low. (True) 

I lose interest in things which I cannot get or do 
right away. (True) 

I have often spent more money than I had by bor- 
rowing on the spur of the moment. (True) 

Right now I have some money saved up. (False) 

I sometimes break a date with someone without 
telling him about it. (True) 

When we go out together, I sometimes walk off and 
leave my friends without telling them about it. (True) 


The third hypothesis states that Terminators 
are less likely to report anxiety than Re- 
mainers. The Taylor scale items correlated 
significantly with the criterion were the fol- 
lowing: 


I am often sick to my stomach. (False) 

I wish I could be as happy as others. (False) 

I find it hard to keep my mind on a task or job. 
(False) 

Life is often a strain for me. (False) 

I am not very confident of myself. (False) 

My hands and feet are usually warm enough. 
(True) 


Denial of these reactions certainly lends sup- 
port to the postulated lack of, or failure to 
report, anxiety in the Terminator patient. 

Are Terminators more “authoritarian” than 
Remainers as hypothesized? Terminators tend 
to agree with the following F scale items which 
are significantly related to the criterion: 


When a person has a problem or worry, it is best 
for him not to think about it, but to keep busy with 
more cheerful things. 

Nowadays when so many different kinds of people 
move around and mix together so much, a person has 
to protect himself especially carefully against catch- 
ing an infection or disease from them. 

Sex crimes, such as rape and attacks on children, 
deserve more than mere imprisonment; such criminals
ought to be publicly whipped, or worse. 

People can be divided into two distinct classes: 
the weak and the strong. 

There is hardly anything lower than a person who 
does not feel a great love, gratitude, and respect for 
his parents. 


Someday it will probably be shown that astrology 
can explain a lot of things. 

Nowadays more and more people are prying into 
matters that should remain personal and private. 


The statements describe an individual who be- 
lieves in the repression of conflict, who is lack- 
ing in psychological sophistication and insight, 
and who identifies with authority. To the ex- 
tent that these statements represent measures 
of “authoritarianism,” the fifth research hy- 
pothesis is supported. 

We next turn to a consideration of the pat- 
terns established. Let the order of the sub-
scales be as follows: (a) F scale, (b) Manifest
Anxiety scale, (c) Behavior Disturbance scale, 
(d) Social Worker Prediction. Let each fre- 
quency distribution be dichotomized at the 
mean. Next let U represent the upper or Re- 
mainer half and L represent the lower or 
Terminator half of each distribution. Then a 
patient allocated to the Terminator pattern 
LLLL is low on all four variables and is char- 
acterized as authoritarian, nonanxious, and as 
reporting an antisocial history; he is also rated 
Terminator by the social worker. An example 
of a Remainer pattern is UULU which repre- 
sents a patient who is nonauthoritarian, anx- 
ious, with a history of antisocial behavior, but 
rated Remainer by the social worker. 
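The pattern notation can be made concrete with a short sketch that dichotomizes each of the four variables at its mean and joins the resulting letters in the stated order. It assumes every variable has already been scored so that higher values point toward remaining; assigning scores exactly at the mean to the Lower half is also an assumption, and the variable names and example scores are hypothetical.

```python
from statistics import mean

# Order follows the text: F scale, Manifest Anxiety, Behavior Disturbance,
# Social Worker Prediction. Names are illustrative only.
VARIABLES = ("f_scale", "manifest_anxiety", "behavior_disturbance", "social_worker")


def pattern_labels(patients):
    """Return one four-letter U/L pattern per patient."""
    means = {v: mean(p[v] for p in patients) for v in VARIABLES}
    return [
        "".join("U" if p[v] > means[v] else "L" for v in VARIABLES)
        for p in patients
    ]


# Two made-up patients, scored in the Remainer direction.
example_patients = [
    {"f_scale": 2, "manifest_anxiety": 1, "behavior_disturbance": 5, "social_worker": 0},
    {"f_scale": 6, "manifest_anxiety": 7, "behavior_disturbance": 3, "social_worker": 1},
]
example_patterns = pattern_labels(example_patients)
```

Four binary splits give 2 × 2 × 2 × 2 = 16 possible labels, which is where the 16 patterns come from.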

An examination of the five most discriminat- 
ing Remainer patterns is revealing. Any com- 
bination of three or more Upper scores is pre- 
dictive of remaining in treatment. The Mani- 
fest Anxiety scale seems to be most crucial as 
judged by the percentage of successes; Re-
mainers are anxious. Terminators, in turn, are
best identified by any combination of three or
more Lower scores, or by any two Lower scores
provided one is on the Anxiety scale.

Table 2


Number of Terminators (T) and Remainers (R)
Classified into Each of the 16 Patterns
(Patients in both Samples A and B are scored
on Sample A patterns)


[Columns: Pattern; Sample A (T, R); Sample B (T, R).
Patterns listed: LLLL, LLLU, LLUL, LLUU, LULL, LULU, LUUL, LUUU,
ULLL, ULLU, ULUL, ULUU, UULL, UULU, UUUL, UUUU.
Cell frequencies illegible.]


The discriminatory validity of the config-
ural approach may be judged by the data
presented in Table 2. The optimum cutting
score for separating Terminators and Re-
mainers in the cross-validation sample was
first estimated. It is the point of intersection
of the frequency distributions of configural 
scores for the two criterion categories (4). 
With this cutting score, 56 (or 80%) of the 
Terminators are correctly classified by the test 
battery. This compares favorably to the 55% 
Terminators one would estimate from the base 
rate in the present population. In all, 90 (or 
71%) of Terminators and Remainers are cor- 
rectly classified. 
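The percentages in this paragraph can be checked with a little arithmetic. In the sketch below, the counts 56 and 90 come from the text and the total of 126 from Table 3; the figure of 70 Terminators is not stated directly and is derived here from 56 being 80% of the Terminators, so it should be read as an inference.

```python
# Counts stated in the text or in Table 3.
correct_terminators = 56
correct_total = 90
total_cases = 126

# Derived, not stated in the source: 56 is 80% of 70.
total_terminators = 70

hit_rate_terminators = correct_terminators / total_terminators
overall_accuracy = correct_total / total_cases
base_rate = total_terminators / total_cases

print(f"Terminators correctly classified: {hit_rate_terminators:.1%}")
print(f"All cases correctly classified:  {overall_accuracy:.1%}")
print(f"Base rate of Terminators:        {base_rate:.1%}")
```

The comparison that matters is between the first and third figures: calling every patient a Terminator would be right only at the base rate, whereas the cutting score identifies a larger share of actual Terminators.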

The Remainer is thus seen as an anxious, 
self-dissatisfied individual with some psycho- 
logical insight who is willing to explore his 
personal problems with others. He has some 
sense of loyalty to others and tends to persist 
in activities he undertakes. He is not likely to 
have been involved in antisocial acts. On the 
other hand, the Terminator either is not anx- 
ious or does not admit to being anxious and 
self-dissatisfied. He is likely to have had a his- 
tory of antisocial acts, he admits to being un- 
dependable and impulsive, and may be au- 
thoritarian or rigid in his social attitudes. 

Agreement of these findings with previously 
reported studies is substantial. Frank et al.
(3), in their summary, list the following at-
tributes as related to remaining in therapy:
social class, education and occupation, fluctu-
ating illness with manifest anxiety, readiness
to communicate distress and personal liabili- 
ties, influenceability, social integrity, and 
perseverance. Similarly, Hiler (5) lists educa- 
tion, intelligence, level of ambition, tendency 
to be introspective, degree of psychological 
sophistication, degree of anxiety experienced, 
and felt dissatisfaction as important variables. 
Other reported studies on the role of social 
class place a greater importance on socio- 
economic status than was evidenced in the 
present study. 

Undoubtedly premature termination is also 
a function of the therapist and the type of 
therapy offered. The therapist—patient rela- 
tion and the competence of the therapist may 
both relate to length of stay in treatment. It 
would also seem probable that in order to 
reduce the rate of dropout, therapeutic pro- 
cedures must be developed for handling the 
relatively uneducated, poorly motivated, and 
psychologically unsophisticated patient. Per- 
haps special orientation procedures are needed. 
Possibly drugs should be used as adjunctive 
treatment. In any case, premature termination 
in large numbers suggests that present clinic 
practice does not meet the needs of a con- 
siderable proportion of patients referred for 
treatment. 


Table 3


Number of Terminators (T) and Remainers (R)
Correctly Classified in Cross-Validation
Sample B when Cutting Score
is Optimum


                                Actual classification
Predicted classification        R        T        Total

T                               22       56       78
R                               34       14       48

Total                           56       70       126


Summary


The aims of the study were to check the
validity of a predictive test battery and to
test six research hypotheses concerning char-
acteristics that differentiate patients who ter-
minate prematurely from those who remain
for six months or longer. Data on five tests
were collected on 300 cases seen in 13 clinics. 
Scores based on the a priori keys did not dif- 
ferentiate the groups at an acceptable level of 
significance although all were in the expected 
direction. However, the item analysis resulted 
in three subscales that did tend to support 
four of the hypotheses. The best three sub- 
scales and the social worker’s prediction com- 
bined configurally correlated .43 with the 
Terminator—Remainer criterion in the cross- 
validation sample. Compared to Terminators, 
Remainers tend to be more anxious, more self- 
dissatisfied, and more willing to explore per- 
sonal problems with others. Remainers also 
are less likely to have a history of antisocial 
acts and are more dependable, more con- 
trolled, and more persistent in tasks under- 
taken than Terminators. 


Received November 1, 1957. 
In the book Explorations in Personality
(4), a study relating hypnotizability to per- 
sonality needs is reported. Hypnotizability 
was found to correlate .43 with need for De- 
ference and — .44 with need for Autonomy. 
The Sway Test has been suggested as a pre- 
dictor of hypnotizability (3) and has been 
found to load highly on a factor of “primary 
suggestibility” (2). 

The Sway Test was used as part of two 
other studies going on at this institute. The Ss 
were student nurses. Of the Ss who had taken 
the Sway Test, 27 had also taken the Ed- 
wards Personal Preference Schedule (EPPS) 
(1), a test of manifest needs based on the 
Murray system; and 19 Ss had also taken the
Thematic Apperception Test (TAT) which
had been scored by Murray’s need system.
The authors decided to compare relevant per- 
sonality needs of high and low groups on the 
Sway Test on the assumption that suggesti- 
bility is related to dependency traits. Sway 
tendency was measured in inches of positive 
sway on two trials. The group was divided at 
the median into high and low swayers. 

Although a number of need scores were ex- 
amined in comparing the two groups, three 
are directly relevant because of the White 
study: Deference, Autonomy, and Succorance. 
Succorance and Deference seem to measure 
“Dependency” while Autonomy would define 
the other pole of this dimension. The low 
swayers scored significantly higher on Au-
tonomy on the EPPS than the high swayers
(p below .01). The other differences were not
significant. The Autonomy score on the EPPS
indicates endorsement of statements which
express a desire “. . . to be independent of
others in making decisions, to feel free to
do what one wants . . . to avoid situations
where one is expected to conform . . .” (1,
p. 5).


1 An extended report of this study may be ob-
tained without charge from Marvin Zuckerman, In-
stitute of Psychiatric Research, 1100 West Michigan
Street, Indianapolis 7, Indiana, or for a fee from the
American Documentation Institute. Order Document
No. 5581, remitting $1.25 for photocopies and $1.25
for microfilm.

The high swayers scored significantly higher 
on Succorance attributed to the hero in TAT 
stories (p below .05) and higher on Succor- 
ance attributed to all characters in stories (p 
below .05). The other differences were not 
significant. A need for Succorance on the TAT 
indicates the number of instances where the 
S saw the character in the story asking for or 
receiving help, sympathy, or support from an- 
other individual or when the character is dis- 
turbed over loss of a source of love and sup- 
port. 

The findings from the two personality tests 
are congruent and indicate that a person who 
is suggestible is likely to be a person with 
strong dependency needs, while a person who 
resists suggestion is more liable to have 
stronger needs for independence or autonomy. 


Brief Report. 
Received March 19, 1958. 
as a Function of Initial In-Therapy Behavior 1 
In a previous paper (4), the authors reported 
that certain differences in client personality 
structure at the outset of therapy appear 
to be significantly related to differences 
of length and outcome in client-centered ther- 
apy. The present study investigates the hy- 
pothesis that the manner in which clients ini- 
tially present their problems and their initial 
in-therapy mode of approach to the resolu- 
tion of those problems are also significantly 
related to differences in length and outcome. 


Methods and Procedure 
The Instrument 


On the basis of the preceding hypothesis, 
five descriptions of initial in-therapy client 
behavior were formulated, and each desig- 
nated by one of the letters, V, W, X, Y, or Z. 
A complete statement of each of these de- 
scriptions follows: 


V: Immediately deals with a feeling-in-relationship 
problem and has already somewhat localized a rather 
specific source or area of difficulty. Deals very much 
with what he says and does, how he acts and feels 
in situations, and discusses the interpersonal effects 
of these. He is quite internally focused and has a 
very strong and very apparent drive to generate and 
examine impulses, thoughts, ideas, despite resultant 
fear, guilt, sadness, etc. The person orients himself 
in the therapy situation as if saying: “This is my re- 
sponse to such-and-such and this is the kind of situa- 
tion in which I find myself; now assuming that I 
somehow contribute to this situation, I want to alter 
my responses and behaviors and resolve or diminish 
the disturbances I feel.” 

1 This investigation is based upon a dissertation 
submitted by the senior author to the University of 
Chicago in partial fulfillment of the requirements for 
the degree of doctor of philosophy. It was supported 
in part by research grants from the National Institute 
of Mental Health, National Institutes of Health, 
U. S. Public Health Service, and from the Ford 
Foundation. 

W: Immediately deals with feeling-in-relationship 
problems but has not clearly differentiated a specific 
source of difficulty. That is, relational difficulties are 
perceived, but the feelings involved are not under- 
stood; no clear connection with the situation is ap- 
parent. There is strong drive toward clearer differ- 
entiation and understanding of where the source of 
interpersonal or self-disturbance lies and how it oc- 
curs. There is as well a general concern to discover 
just how and where in his disorders he himself is 
contributing, and effort is spent in driving himself to 
look at generated impulses for his contributions to 
his undesirable life situations so that he might change 
and resolve his disturbances. 

X: Vacillates between dealing with relationship 
problems and discussing externals and listing attri- 
butes of situations and of others. He may openly 
exhibit emotional behavior: e.g., depression, crying, 
fear, anger, etc. He may give way to a mood or feel- 
ing experience of the moment but usually expresses 
or discusses these feelings in terms of external cause 
rather than in terms of “this is my response to such 
and such” (external); though, to repeat, there is 
usually a vacillation between expression of mood or 
emotion and discussion of it in terms of internal and 
external causes. In a word, he vacillates between ex- 
pressing himself about basic feeling and interaction 
problems and the manifestations of those problems 
in very general terms without definite pointing to 
his own contributions to situations and his responsi- 
bility in them; and yet shows definite indications 
now and then of feeling that he is in some way a 
contributor and has some responsibility for the 
situations. 

Y: Does not deal with feeling-in-relationship prob- 
lems but discusses external manifestations of internal 
difficulties; or discusses feelings as if they are ex- 
ternal objective things to be intellectually named, 
labeled, or categorized (e.g., “I use a lot of attention-getting 
mechanisms in social situations”). There 
is a rigorous intellectual control of impulses so that 
they are shunted into structured categories and ex- 
planatory generalities which seem to give a measure 
of momentary satisfaction and comfort. Such dealing, 
shunting, and applying of generalities appear to 
represent problem-resolving activity and are usually 
followed by a similar procedure about another 
behavioral area. There is strong avoidance of discussion 
on the feeling level in relationships, but listing 
of attributes of feeling life may be done intellectually 
and analytically. 

Z: Deals with problems as though they are almost 
entirely external to him. The localization of source 
of difficulties is seen to be vaguely “beyond” the 
person, the focus being quite external. His approach 
is almost that of listing attributes of people and 
situations, with very little self-responsibility understood. 
It is as if at times the person is attempting to 
resolve his difficulties by explaining the people with 
whom he is involved, disturbed about, guilty toward, 
etc. It is sometimes as if he is saying: “Things should 
be different, and if they were, I'd be all right.” He 
often appears to be describing the various facets of 
disturbing situations and relationships as if they are 
entirely outside him and as if then waiting for something 
to be done about them. Since he focuses the 
problems outward, he seems to be asking: “What 
can be done to change this so that I won’t feel disturbed?” 
The therapy situation thus seems to be tripolar: 
the client, the problems, the counselor. There 
is avoidance of discussion of internal feelings in relationships, 
even though feeling may be apparent in 
voice tone, gesture, words used, etc. 


Table 1 

Results of Application of the Five Descriptions to First Therapy Interviews of Twenty-four Cases for Which 
Both Length of Therapy and Therapist Success Rating Were Known to One of the Judges 

Column headings: Short failure (12 or less interviews); Failure zone (13-21 interviews); Short success (12 or less interviews); Long success (More than 21 interviews); Long failure (More than 21 interviews) 


Subjects 


The total sample for this study is com- 
posed of 42 cases, all seen by therapists at 
the Counseling Center, The University of 
Chicago, during the period 1949-54. They 
were participants in various research programs 
being carried out at the Counseling Center at 
that time. The sample was composed of 21 
students, 21 nonstudents. There were 20 females 
and 22 males. The mean age was 27.9 
years; the age range was from 19 to 41 years. 
Forty-one of the cases had male therapists; 
in one case the therapist was female. 

Twenty-four cases from the total sample 
represent only the extremes on a therapist 9- 
point success rating scale—that is, ratings of 
1 through 4 being failures, and 7 through 9 
being successes. The remaining 18 cases of the 
total sample represent the more middle range, 
points 5 and 6, of the 9-point rating scale, as 
well as its extremes. 


Procedure 


The instrument described in a previous sec- 
tion of the present paper was applied to the 
first therapy interview only of each of the 
42 cases constituting the sample. Both length 
of therapy and therapist success rating were 
known by the senior author (who acted as 
one of the judges) for 24 of the 42 cases. Of 
these 24 cases, 14 were examined and rated 
jointly by him (Judge I) and another judge 
(II). The remaining 10 of these 24 cases were 
rated independently by the two judges. 

For the remaining 18 cases of the total 
sample, neither judge knew length of therapy 
or therapist success rating for any case.2 Application 
of the descriptions to 14 of these 18 
initial therapy interviews was made independ- 
ently by each judge. Judge I alone applied 
the instrument to the remaining 4 cases. 


2 D. S. Cartwright collected the interview material 
and coded it for use by the two judges. 
Results 


Joint rating of 14 of the 24 cases for which 
the senior author alone knew both length of 
therapy and therapist success rating resulted 
in the two judges reaching immediate consensus. 
Independent application of the instrument 
to the remaining 10 of these 24 cases 
resulted in perfect agreement between judges. 

Table 1 presents the results of application 
of the instrument to these 24 cases grouped 
according to length-by-outcome. 

The foregoing applications indicate the 
following predominant relationship between 
client groups and instrument descriptions: 
(a) the failure-zone group may be mainly 
characterized as Y, with two exceptions which 
are Z and X; (b) the short failures may be 
characterized as Z and Y, with one exception 
which is an X; (c) the short successes may 
all be characterized as V; (d) the long suc- 
cesses may be characterized as W, with one 
exception which is an X; and (e) the single 
long failure is described as X. 

Since the block of 18 cases represented both 
the middle range and the extremes of the suc- 
cess—failure scale, it was necessary to redefine 
prediction points for success and failure with 
respect to these cases. Cartwright (2) had 
demonstrated the mean success rating for the 
population of clients terminating therapy at 
the Counseling Center, The University of Chi- 
cago, to be approximately 5.00 on the thera- 
pist 9-point success rating scale. Thus, for the 
purposes of classifying these 18 cases, the 
writers chose to define therapeutic success as 
a rating of 6 or more on that scale, and fail- 
ure as a rating of 5 or less. 
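The grouping rules stated so far can be written out as a short sketch. The length boundaries (12 or fewer, 13-21, more than 21 interviews) and the success cutoff of 6 come from the text. The function name, the parameter names, and the decision to label every 13-21 interview case "failure zone" whatever its rating are assumptions made for illustration.

```python
def length_by_outcome(interviews, rating, success_cutoff=6):
    """Assign a case to a length-by-outcome group.

    interviews: number of therapy interviews.
    rating: therapist rating on the 9-point success scale.
    Assumption: any case lasting 13-21 interviews is labeled
    "failure zone" regardless of its rating.
    """
    if not 1 <= rating <= 9:
        raise ValueError("rating must be on the 9-point scale")
    outcome = "success" if rating >= success_cutoff else "failure"
    if interviews <= 12:
        return f"short {outcome}"
    if interviews <= 21:
        return "failure zone"
    return f"long {outcome}"
```

For the earlier block of 24 cases, which held only ratings of 1 through 4 and 7 through 9, the same cutoff of 6 would sort the cases the same way, since no ratings of 5 or 6 occur there.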

Table 2 presents the results obtained from 
the judges’ application of the descriptions to 
the first therapy interviews in the block of 
18 cases, within the framework of the actual 
length-by-outcome groupings. 

Inspection of the data shown in Table 2 
indicates that the two judges agreed on 13 of 
14 ratings with respect to application of de- 
scription to client, a result significant at bet- 
ter than the .005 level of confidence. 

The results also indicate the following pre- 
dominant relationships between descriptions 
and clients: (a) Short failures may be char- 
acterized as Z or Y, with one exception which 
is an X; (b) short successes were not predicted, 
though two are given such actual ratings 
by the therapist, being rated by the two 
judges as short failures, Z; (c) long successes 
may be characterized as W or X, with one ex- 
ception which was rated by both judges as a 
short failure, Z. Comparison of these findings 
with those in the previous block of 24 cases 
indicates that, with a few important excep- 
tions, the predominant relationships between 
instrument descriptions and length-by-out- 
come groupings appear to hold. 

Table 2 

Results of Application of the Five Descriptions to First Therapy Interviews in Eighteen Cases, 
within the Framework of Their Actual Length-by-Outcome Groupings 

Column headings: Short failure (12 or less interviews), Judge I and Judge II; Short success (12 or less interviews), Judge I and Judge II; Long success (More than 21 interviews), Judge I and Judge II 


Something should be said concerning the 
important exceptions mentioned in the preceding 
paragraph. First, with respect to the 
two cases rated by the therapists as short 
successes and by the judges as short failures, 
perhaps little more can be said than that the 
criteria used by the therapists in these cases 
must have been markedly different from those 
employed by the judges. A reauditing 3 of the 
initial interview in these two cases results in 
the same conclusion as that given in the first 
place, namely, that within the framework of 
the instrument developed for this study, no 
other prediction than short failure appears 
possible with respect to these two cases. 

It should be remembered in this connection 
that the instrument predicted the length of 
therapy quite adequately in both cases. With 
respect to length of therapy, no cases fell 
within the failure zone (13-21 interviews); 
only two cases fell within the short success 
category, and there were no long failure cases 
in this subsample. In order to demonstrate 
statistically something of the predictive value 
of the instrument with regard to length of 
therapy, the subsample of 18 cases was di- 
chotomized on the length dimension. Since 
Cartwright (2) had also demonstrated that 
approximately 50% of the clients entering 
therapy at the Counseling Center leave ther- 
apy at or before the 15th interview and ap- 
proximately 50% remain in therapy for 16 or 
more interviews, these divisions were chosen 
to form the dichotomy. A two-by-two table 
formed on these two divisions of length shows 
that only one case was incorrectly placed; 
the result is significant at better than the .002 
level of confidence. A comparable table formed 
on the dichotomy of therapeutic success (rat- 
ings of 6 or more on the 9-point scale) and 
failure (ratings of 5 or less on the 9-point 
scale) shows that only three cases were in- 
correctly placed; the result is significant at 
better than the .015 level of confidence (3). 
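The report does not name the test applied to these two-by-two tables. One common choice for samples this small is Fisher's exact test, sketched below under that assumption. The cell counts in the usage line are hypothetical, chosen only so that 18 cases are split with a single misplacement; the actual marginal totals are not given in the text.

```python
from math import comb


def fisher_exact_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Returns the probability, with all margins held fixed, of a table
    whose upper-left cell is at least as large as the observed a.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)
    largest_a = min(row1, col1)
    favorable = sum(
        comb(row1, k) * comb(row2, col1 - k)
        for k in range(a, largest_a + 1)
    )
    return favorable / total_tables


# Hypothetical table: rows = predicted short / long,
# columns = actual short / long, one case misplaced.
p_hypothetical = fisher_exact_one_sided(9, 1, 0, 8)
```

A different split of the margins would give a different probability, so the value from this hypothetical table should not be read as a reproduction of the reported levels.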

Of considerable importance is the distribu- 
tion of the X descriptions throughout the 
sample. A number of cases were designated 
as X, and each length-by-outcome group con- 
tains one or several of those so designated. 
Since the instrument did in general differentiate 
correctly the length-by-outcome group 
into which each case should be placed, the 
evidence is clear that the X description is not 
adequately defined for a specific and particular 
task. Such a conclusion derives from the 
observation that cases rated as “long therapy 
X” are almost invariably rated by the therapist 
as successful, whereas those rated “short 
therapy X” are almost invariably rated by 
the therapist as failures. The basis for this 
distinction is not clear. Perhaps further investigation 
of these sorts of matters will lead 
to descriptive behavioral differences on the 
basis of which clearer and more reliable predictions 
can be made. No further resolution 
of rating discrepancies apparent in this study 
can be made at the present time. 

3 Perhaps it should be stated that an original auditing 
and fitting of a description requires approximately 
two and one-half to three hours. Acoustical equipment 
now in common use would probably reduce 
the time considerably. 


William L. Kirtner and Desmond S. Cartwright 


Discussion 


Following examination of factors such as 
age, sex, status (student-nonstudent), and 
therapist, in relation to length and outcome 
of therapy, Cartwright concluded that (1, p. 
362) “. . . certain individual differences be- 
tween clients . . . give rise to different kinds 
of therapeutic process.” In addition, he hy- 
pothesized that the failure-zone may be char- 
acterized by (1, p. 363) “a drastic behavioral 
manifestation of resistance,” suggesting that 
“If such a view were confirmed by further re- 
search, it would raise seriously the question 
of whether or not client-centered therapists 
should modify their approach to therapy to 
include the possibility of temporarily directive 
behavior at a point during this critical failure 
zone when it seems likely that a client will 
leave.” 

The present study supports Cartwright’s 
hypothesis that individual differences between 
clients account, at least in part, for different 
kinds of therapeutic process. However, as a 
result of the present study, it would seem 
that earlier rather than later in-therapy modi- 
fication of therapist approach might be de- 
sirable or required if failure-zone as well as 
short failure cases are to remain in therapy. 
Whether or not such modification of therapist 
approach should be toward “directive” opera- 
tions remains unclarified. 

In contrast to the earlier hypothesis concerning 
resistance during the critical failure-zone 
period, the present study suggests that 
the resistance may be identifiable at the out- 
set of therapy, though such a suggestion does 
not rule out the possibility of increasing re- 
sistance and drastic display of it in later pe- 
riods of therapy. 

It seems reasonable to suppose that the 
manner in which the client conceptualizes and 
attempts to resolve his problems will have 
much to do with whether or not he achieves 
resolution. In client-centered therapy as prac- 
ticed today, there is no place for activity on 
the part of the therapist specifically directed 
toward helping the client to change an ineffec- 
tive mode of approach to the resolution of his 
problems. Persistent understanding and ac- 
ceptance of a “resistant” mode of approach 
may well do little more than reinforce that 
approach. 

Whether modes of approach to problems are 
indeed modifiable at all is, of course, an open 
question waiting upon controlled investiga- 
tion. But the writers conclude that in subse- 
quent study of the therapeutic process, spe- 
cial attention must be paid to that aspect 
of interaction between client and therapist; 
namely, to the client’s approach to his problems 
and the therapist’s responses to that 
approach. 


Summary 


The first therapy interviews of 42 clients 
seen by client-centered therapists were ex- 
amined through the application of an instru- 
ment describing five differing modes of client 
initial approach to the resolution of personal- 
life difficulties. 

Results from two independent judges showed 
that the descriptions differentiated rather 
clearly the length-by-outcome groups consti- 
tuting the total sample of the study. A discus- 
sion of these results followed, and it was con- 
cluded that the manner in which clients ap- 
proach and conceptualize their problems will 
have much to do with whether or not resolu- 
tion is achieved. 


Received July 31, 1957. 
Recent research has dealt with the relationship 
between personality factors and field independence. 
Witkin et al. have reported that 
“the personality characteristic most closely 
related to field dependence or independence 
seems to be the tendency toward active coping 
with or passive submission to the environ- 
ment” (2, p. 474). Field independent persons 
were described as more analytical and more 
capable of mastering environmental forces 
than field dependent persons who were charac- 
terized as passive and submissive toward au- 
thority. Wertheim and Mednick (1) reported 
a significant relationship between field inde- 
pendence and need achievement. The present 
study attempted to test the above findings 
using objective paper-and-pencil tests. It was 
hypothesized that field independence is posi- 
tively correlated with needs Achievement, 
Autonomy, Dominance, and Intraception and 
negatively correlated with Succorance. 

Sixty-nine undergraduate Ss (57 female, 12 
male) were administered the Edwards Per- 
sonal Preference Schedule (EPPS) and the 
Thurstone adaption of the Gottschaldt Em- 
bedded Figures Test (EFT). On the EFT, S 
was shown 18 simple geometric designs, each 
followed by four complex designs with in- 
structions to find the simple design in the 
complex designs. S’s score on the EFT was the 
number correct in the 20 minutes allotted for 
the task. 

1 An extended report of this study may be obtained 
without charge from David Marlowe, Ohio State 
University, Room 304, Arps Hall, Columbus 10, Ohio, 
or for a fee from the American Documentation Institute. 
Order Document No. 5692, remitting $1.25 
for microfilm or $1.25 for photocopies. 

Product-moment correlations were obtained 
between scores on the EFT and the scores on 
each of the 15 needs measured by the EPPS. 
Ss’ scores on the EFT ranged from 9 to 34 
with a mean of 21.3. The greater the S’s score 
on the EFT, the greater the field independence. 
Of the five needs hypothesized to be rele- 
vant, only two yielded significant correlations 
with field independence: Intraception, .34 (p 
< .01) and Succorance, -.30 (p < .02). It 
can be concluded that in the present sample, 
field independence is associated with the need 
to be analytical in regard to the behavior and 
motives of one’s self and others (Intracep- 
tion), and with a relative absence of passive- 
dependent needs (Succorance). These results 
offer only partial support for Witkin’s find- 
ings. Most noteworthy is the failure of Au- 
tonomy and Dominance to yield significant 
correlations. These needs may be considered 
similar to Witkin’s “active coping” and “mas- 
tery of environmental forces” factors. The 
failure to obtain results similar to Wertheim 
and Mednick’s for need Achievement may be 
due to differences between the studies in meas- 
uring instruments. In particular, n Ach as 
measured by fantasy materials may not be 
equivalent to n Ach as measured by the 
EPPS. 
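The report gives the correlations and their significance levels but not the computation behind the levels. A standard way to evaluate a product-moment r is to convert it to a t statistic with n - 2 degrees of freedom, as in the sketch below. It assumes that all 69 Ss entered each correlation; the function name is illustrative.

```python
from math import sqrt


def t_from_r(r, n):
    """t statistic, on n - 2 degrees of freedom, for a product-moment r."""
    if n < 3 or abs(r) >= 1:
        raise ValueError("need n >= 3 and |r| < 1")
    return r * sqrt(n - 2) / sqrt(1 - r * r)


t_intraception = t_from_r(0.34, 69)   # reported r for Intraception
t_succorance = t_from_r(-0.30, 69)    # reported r for Succorance
```

With 67 degrees of freedom, the two-tailed critical values of t are roughly 2.65 at the .01 level and 2.38 at the .02 level, so under these assumptions both statistics are in line with the levels the report states.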
Brief Report. 
Received April 28, 1958. 
Our research group has as its major interest 
the study of the process of psychotherapy, 
that is, the analysis of the interaction between 
patient and therapist (2, 3, 6). One im- 
mediate goal is the testing of several specific 
hypotheses concerning the relationship of the 
therapist’s interpretive activity to the mani- 
festation of resistance in the patient. Due to 
the relative recency with which interest in this 
area of therapy research has developed, how- 
ever, we have found it necessary to digress in 
order to solve some of the measurement and 
methodological problems involved in opera- 
tionalizing our variables (1, 5). This paper 
reports further research concerned with the 
effects of varying conditions of rating and 
classes of raters upon one of our key variables, 
Depth of Interpretation. 

In a previous study clinical psychologists 
made ratings of therapy interviews on Depth 
of Interpretation under varying conditions of 
presentation, context, and unit size. In that 
study, as in the present one, interpretation was 
defined as “any behavior on the part of the 
therapist that is an expression of his view of 
the patient’s emotions and motivations . . .” 
(4, p. 247). The greater the disparity between 
the view expressed by the therapist, and the 
patient’s awareness of these emotions and 
motivations, the deeper the interpretation. A 
seven-point scale for rating Depth of Interpretation 
was developed by the Method of 
Equal Appearing Intervals from 70 statements 
descriptive of therapist behavior. An analysis 
of variance revealed: (a) Raters were able to 
apply this particular scale so as to distinguish 
between interviews; (b) a single, over-all rating 
of an interview led to a deeper rating than 
that obtained from the mean of response-by-response 
ratings; (c) ratings made from typescripts 
did not differ from those based upon 
tape recordings. 

1 This study was carried out under the auspices of 
United States Public Health Service Grant M-516, 
“Analyses of Therapeutic Interaction,” E. S. Bordin 
and R. L. Cutler, Principal Investigators. 

2 Now at Children’s Hospital, Philadelphia, Pennsylvania. 

All of these findings supported our expecta- 
tions. Contrary to what we anticipated, how- 
ever, there were no differences ascribable to 
varying amounts of context. That is, it made 
no difference whether the raters were given the 
immediately preceding interview in its entirety 
before rating the crucial interview, or whether 
they made their ratings using only the thera- 
pist’s responses, with all patient material de- 
leted. 

It was primarily because we were concerned 
about this inability of our raters to utilize in- 
creasing amounts of contextual information to 
increase their interjudge agreement that the 
present study was designed. We first sought 
logically to identify those elements in the 
stimulus material which were potential con- 
tributors to the variance in rating Depth of 
Interpretation. The first of these hypothetical 
contributors, Element A, may be thought of as 
consisting of valid cues from the stimulus ma- 
terial which provide evidence as to the level of 
the patient’s awareness of his emotions and 
motivations, as to the therapist’s expression of 
his view of the patient’s emotions and motiva- 
tions, and, hence, as to the disparity between 
them. Thus, Element A represents the “true” 
variance of Depth of Interpretation. The 
second, Element B, consists of a set of cues 
which lead to reliable judgments based upon 
some implicit, absolute conception of what 
constitutes a deep or shallow interpretation. 
The third, Element C, consists of irrelevant 
or interfering cues from either the patient’s or 
the therapist’s comments, which serve to mis- 
lead or confuse the raters. A fourth element, 
which is relatively independent of the stimu- 
lus material, but which interacts with Ele- 
ments A, B, and C in the total rating process, 
is the sensitivity to the various proportions of 
A, B, and C in the stimulus material which is 
available in the particular judge population. 

It is immediately apparent that Element A 
should be present to a greater degree in a 
situation where stimulus material is presented 
with more, rather than less, context. Element 
B may or may not be increased by increasing 
the amount of contextual material available to 
the judges during the rating process. Element 
C is almost certainly increased, since the 
judges are asked to maintain an awareness of 
considerable additional material, and must, in 
addition, decide whether a given bit of stimu- 
lus material is valid and relevant (Element 
A) or irrelevant (Element C). 

Table 1 

Assignment of Raters to Experimental Conditions a 

Column headings: Typescript (TY), Therapist Only (TO); Typescript (TY), Preceding Interview (PI); Recording (RE), Therapist Only (TO); Recording (RE), Preceding Interview (PI) 

Note. The latin square was originally designed to pair one 
analyst and one fledgling in each of the four order-condition-case 
sequences. However, due to difficulties in obtaining qualified 
raters, it was necessary to modify the design as above. 

a This table may be recomposed into a modified latin-square 
design. 

b Key to notation. Capital letters refer to Analyst (A) or 
Fledgling (F). Roman numerals indicate judges’ identifying 
numbers. First arabic numeral in sequence indicates interview 
number. Second arabic numeral in sequence indicates order in 
which the particular judge performed this segment of his 
rating task. 


We were thus led to consider the possibility 
that the judge population was crucial in this 
particular rating process, and that a particular 
kind of sensitivity to the nuances of thera- 
peutic interaction was necessary to enable 
judges to capitalize upon the additional con- 
textual material to increase their interjudge 
reliability. Given adequate training in the per- 
ception of cues in the patient’s and therapist’s 
responses relevant to the rating of Depth of 
Interpretation, and in the disregarding of ir- 
relevant or confusing cues, our judges should 
demonstrate more convincing interjudge agree- 
ment than had been found in the earlier study. 


Procedures 


Accordingly, we enlisted as expert judges in 
this study four psychoanalysts and four psy- 
chiatrists who had completed their personal 
analyses and were undergoing control training 
in psychoanalysis, but who had not yet been 
admitted to full standing in either the local or 
national analytic societies. The former group 
are hereafter called analysts; the latter, for 
want of a better term, fledglings. 

Our rationale for the selection of this par- 
ticular judge population was as follows: even 
though our clinical psychologist judges had 
had experience as psychotherapists, only a few 
had completed a personal analysis, and none 
had had the benefit of psychoanalytic training. 
We felt that the emphasis in analytic training 
upon the development of an acute self-aware- 
ness, with its consequent increased perceptual 
and interpersonal sensitivity and objectivity, 
might better enable the analysts to judge the 
degree of the patient’s awareness of his emo- 
tions and motivations, to assess the therapist’s 
statements concerning his view of the patient’s 
emotions and motivations, and hence more 
reliably to rate Depth of Interpretation. These 
judges should also be better equipped to weed 
out irrelevant cues, and perhaps to be less led 
astray by invalid cues of Element B. In short, 
we hoped that the analysts and fledglings 
would be better able to capitalize upon in- 
creased amounts of contextual information to 
increase their interjudge agreement by making 
maximal use of Element A and by limiting the 
distracting effects of Element C. 

In addition, by including the fledglings, we 
hoped to assess the effects of varying amounts 
of training and experience within the group 
which had had the benefit of formal analytic 
training. 


Table 2 

Breakdown of Sums of Squares and F Tests for Analysis of Variance 

Source of variation          Sum of squares   Mean square   F 
Sequences                    4.26             1.42          33.02** 
Raters within sequences      5.46             1.36          31.74** 
Total between raters         9.72 
Order 
Interviews 
Experimental conditions 
  Presentation 
  Context 
  Presentation X Context 
Pooled error 
Total within raters 
Grand total 

** Significant at the .01 level 

Two further F entries, 14.95** and 2.16, are legible in this copy but cannot be matched to their rows. 

A modified latin-square design was pro- 
jected which was essentially a replication of 
the design of our earlier study. It permitted 
the systematic variation or control of two 
levels of context (therapist responses only vs. 
preceding interview) and two methods of 
presentation (typescript vs. tape recording). 
It also enabled us to assess the effects of prac- 
tice, order of presentation of the interviews, 
judges, and differences due to the interviews 
themselves. In addition, it was possible to ob- 
tain a rough estimate of the effect of experi- 
ence within this judge population (analysts vs. 
fledglings) by means of a t test, although this
difference is confounded with order and condi- 
tions. In the actual process of rating, each 
judge rated each of the four interviews re- 
sponse-by-response on the seven-point Depth 
of Interpretation rating scale (Table 1). 
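
As a rough illustration of how a design of this kind balances conditions, the sketch below builds a plain 4 x 4 cyclic Latin square in which each of four hypothetical rater sequences meets each interview once and each context-by-presentation condition once. The cyclic construction, the function name, and the pairing of sequences with interviews are illustrative assumptions; the modified design actually used is not described here in enough detail to reproduce. The condition labels follow the abbreviations used in the tables (TO, PI, TY, RE).

```python
from itertools import product

# The four experimental conditions: two context levels crossed with
# two presentation methods (labels follow the table abbreviations).
CONTEXTS = ["TO", "PI"]        # therapist only, preceding interview
PRESENTATIONS = ["TY", "RE"]   # typescript, recording
CONDITIONS = [f"{c}/{p}" for c, p in product(CONTEXTS, PRESENTATIONS)]


def cyclic_latin_square(items):
    """Return an n x n Latin square built by cyclic shifts of `items`."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


# Rows are hypothetical rater sequences; columns are Interviews I to IV.
square = cyclic_latin_square(CONDITIONS)

for seq, row in enumerate(square, start=1):
    print(f"Sequence {seq}: " + "  ".join(row))
```

Every row and every column of the square contains each condition exactly once, which is the property that lets order, interview, and condition effects be separated.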


Results 


The results of the analysis of variance are 
presented in Table 2. A word should be said 
at this point about this analysis. We were 
concerned with the effect of the varying condi- 
tions of rating upon the means of the response- 
by-response ratings in an interview. These
means are extremely stable, and thus the total
sum of squares is extremely small. In order to 
be conservative, we included the second and 
third order interactions in the error term. 

The analysis of variance showed that, as in 
the previous study, this second judge popula- 
tion was able to apply the Depth of Interpre- 
tation rating scale to differentiate between 
interviews. By far the greatest amount of the 
total variance was contributed by the judges; 
neither order of presentation nor varying 
amounts of context contributed significantly to 
the variance. However, the context by presen- 
tation interaction approached the .05 level of 
significance. 

A second portion of our data analysis was 
directed toward the interjudge agreement or 
rating reliability across and within the varying 
conditions. A summary of the interjudge reliabilities
by cases is presented in Table 3.
The product-moment correlations are remark- 
ably constant across Cases I, II, and III, but 
in Case IV there is a significant drop in interjudge
reliability.3 A possible explanation for
this difference lies in the fact that while Cases
I, II, and III are counseling or psychotherapy
sessions with normal or neurotic subjects,
Case IV is from a series of interviews between
a schizophrenic patient under insulin treatment
and an inexperienced psychotherapist.


3 Significance of difference between ranges of correlations
was assessed in every case by the Median
test, a nonparametric statistic which does not require
the assumption of underlying normality in the population
of correlations (7).
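
The interjudge reliabilities summarized by case are product-moment correlations between pairs of judges, reported as a range and a median. A minimal sketch of that summary follows. The function names are hypothetical, and the ratings are made up for illustration; they are not data from the study.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, median


def pearson_r(x, y):
    """Product-moment correlation between two equal-length rating lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def interjudge_summary(ratings):
    """Return (lowest r, highest r, median r) over all pairs of judges."""
    pairs = combinations(sorted(ratings), 2)
    rs = [pearson_r(ratings[a], ratings[b]) for a, b in pairs]
    return min(rs), max(rs), median(rs)


# Made-up ratings by three hypothetical judges of six therapist responses
# on a seven-point scale.
example = {
    "judge_1": [2, 4, 3, 6, 5, 1],
    "judge_2": [3, 4, 2, 7, 5, 2],
    "judge_3": [1, 5, 4, 5, 6, 2],
}

low, high, mid = interjudge_summary(example)
print(f"range {low:.2f} to {high:.2f}, median {mid:.2f}")
```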

In Table 4 are presented the interjudge re- 
liability coefficients by conditions of presenta- 
tion and context. Significantly higher inter- 
judge agreement is found when the analyst- 
fledgling judges rate under identical conditions 
of presentation and context than when ratings 
are made under totally different conditions. 
However, the rather startling result is re- 
vealed that when this group of judges had ad- 
ditional contextual material at its disposal, 
the interjudge reliabilities were significantly 
lower than when they made their ratings on 
the basis of information from the therapist’s 
responses alone. When the results for the 
analysts and fledglings are viewed separately, 
the number of correlations becomes so small 
that statistical significance is not obtained, 
but the trend is equally apparent in both the 
analysts and the fledglings and cannot be con- 
sidered to be the result of the difference in 
experience. In addition, agreement was slightly, 
although not significantly, better when the 
ratings were based upon typescripts rather 
than upon tape recordings. 
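
Comparisons of this kind were assessed with the Median test. A minimal sketch of a two-sample median test on sets of correlations follows. It assumes the common form: a 2 x 2 table of counts above and not above the pooled median, tested by chi-square with one degree of freedom and no continuity correction. The exact variant used in the study is not stated, and the correlations below are made up for illustration.

```python
from math import erfc, sqrt
from statistics import median


def median_test(sample_a, sample_b):
    """Two-sample median test: return (chi-square with 1 df, p value)."""
    grand = median(list(sample_a) + list(sample_b))
    a_above = sum(v > grand for v in sample_a)
    b_above = sum(v > grand for v in sample_b)
    a_below = len(sample_a) - a_above
    b_below = len(sample_b) - b_above
    n = len(sample_a) + len(sample_b)
    denom = (a_above + b_above) * (a_below + b_below) * len(sample_a) * len(sample_b)
    if denom == 0:
        raise ValueError("a margin of the 2 x 2 table is empty")
    chi2 = n * (a_above * b_below - b_above * a_below) ** 2 / denom
    # Upper tail of chi-square with 1 df, via the normal distribution.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical interjudge correlations under two context conditions.
therapist_only = [0.41, 0.38, 0.52, 0.33, 0.47, 0.36]
preceding_interview = [0.22, 0.30, 0.18, 0.35, 0.27, 0.12]

chi2, p = median_test(therapist_only, preceding_interview)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
```

Because the test uses only counts above and below the pooled median, it makes no assumption about the shape of the distribution of correlations.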

Data from the earlier study with the psy- 
chologists were further analyzed to permit the 
assessment of the effects of varying cases and 
conditions of presentation and context upon 
their ratings. The results of these analyses are 
presented in Tables 5 and 6. Once again, Case 
IV is seen to yield consistently lower inter- 
judge agreement, although the Median test 
reveals only the difference between Cases III
and IV to be significant. There is no significant
difference in interjudge agreement
which can be attributed to different methods
of presentation, and while ratings made under
the preceding interview context condition yield
slightly higher interjudge agreement than
those made under the “therapist only” condition,
the difference is not significant. Additionally,
identical conditions of presentation
and context produce no higher agreements
among the psychologists than do totally different
conditions.


Table 3

Interjudge Reliabilities by Cases
(Analyst-Fledgling group)

Case    Range           Median
I       -.03 to .56
II      -.05 to .65
III      .09 to .53
IV      -.15 to .48

Note. Cases I, II, and III show significantly (p < .05)
higher interjudge agreement than Case IV. Cases I, II, and
III do not differ significantly.


Table 4

Interjudge Reliabilities by Conditions of
Presentation and Context
(Analyst-Fledgling group)

Conditions                        Range           Median
Identical (a)                     -.01 to .56     .37
Totally different (b)             -.05 to .58     .26
Identical context (TO) (c)         .05 to .65     .32
Identical context (PI) (d)         .00 to .46     .25
Identical presentation (RE) (e)   -.15 to .53     .25
Identical presentation (TY) (f)    .03 to .53     .31

Note. Identical vs. totally different, p < .05.
Therapist Only context vs. Preceding Interview context, p < .01.
(a) Includes only those correlations between raters operating
under the same conditions of context and presentation.
(b) Includes only those correlations between raters operating
under both different context and presentation conditions.
(c) Includes those correlations between raters operating under
the Therapist Only context condition, irrespective of method
of presentation.
(d) Includes those correlations between raters operating under
the Preceding Interview context condition, irrespective of
method of presentation.
(e) Includes those correlations between raters rating recordings,
irrespective of context condition.
(f) Includes those correlations between raters rating typescripts,
irrespective of context conditions.

Finally, comparisons were made of the in- 
terjudge reliabilities within and among the 
psychologist, analyst, and fledgling groups. In 
general, we find higher interjudge agreement 
among the psychologists than among either 
the analysts or fledglings. When ratings are 
made under identical conditions of presenta- 
tion and context, the psychologists yield a 
range of interjudge reliabilities from .04 to .58, 
with a median of .42, while the comparable 
figures for the analyst group are -.01 to .56,
median .22; and for the fledglings, .09 to .25, 
median .14. These differences among the 
groups, due to the small numbers of cases in- 
volved, are not significant. 

A comparison of the interjudge reliabilities
within each of the three groups, for ratings made
under entirely different conditions of presentation
and context, reveals a range of .07 to .62
for the psychologists, with a median of .42;
a range of .35 to .42, median .36, for the analysts;
and a range of -.02 to .43, median .31, for the
fledglings. Once again, these differences fail to
reach statistical significance, probably because 
of the small number of correlations involved. 

The amount of agreement among the three 
groups was determined by a series of cross 
correlations. That is, the ratings of the analyst 
group were correlated with the ratings of the 
psychologist group, the analysts with the 
fledglings, etc. These results may be summarized
by saying that, in general, the psychologists
agreed more with both the analyst
and fledgling groups than the latter agreed
with each other.


Discussion 


This study supports our earlier evidence 
that raters are able to apply the Depth of 
Interpretation scale so as to differentiate be- 
tween interviews, and indicates further that 
levels of experience and exposure to formal 
analytic training do not substantially modify 
this ability. There is consistent evidence that 
the method of presentation of the interview 
material (typescript vs. recording) has no ef- 
fect either upon mean level of rating or inter- 
judge agreement. While the differences in level 
of interjudge agreement among the three 
groups are small, there is a consistent tendency 
for the psychologist group to agree among 
themselves to a slightly greater extent than 
either the analysts or fledglings. In addition, 
we find that our hypothesis concerning the ef- 
fect of the increased sensitivity resultant from 
analytic training upon the perception of cues
provided by additional contextual information
is not confirmed. Instead, we find the paradoxical
result that among the analyst-fledgling
group, additional context results in a significant
decline in interjudge agreement.


Table 5

Interjudge Reliabilities by Cases
(Psychologist group)

Case    Range           Median
I       -.04 to .62     .37
II       .13 to .63     .37
III      .31 to . 48
IV      -.05 to .73     .2

Note. Significant difference between III and IV, p < .01.
[The Case III entry and the final digit of the Case IV median are only partly legible in the source.]


Table 6

Interjudge Reliabilities by Conditions of
Presentation and Context
(Psychologist group)

Conditions                        Range           Median
Identical (a)                      .04 to .58     .41
Totally different (b)              .07 to .62     .42
Identical context (TO) (c)         .09 to .63     .35
Identical context (PI) (d)        -.05 to .64     .43
Identical presentation (RE) (e)    .04 to .59     .41
Identical presentation (TY) (f)    .03 to .73     .37

Note. None of the three comparisons reveals a significant
difference.
(a) Includes only those correlations between raters operating
under the same conditions of context and presentation.
(b) Includes only those correlations between raters operating
under both different context and presentation conditions.
(c) Includes those correlations between raters operating under
the Therapist Only context condition, irrespective of method
of presentation.
(d) Includes those correlations between raters operating under
the Preceding Interview context condition, irrespective of
method of presentation.
(e) Includes those correlations between raters rating recordings,
irrespective of context condition.
(f) Includes those correlations between raters rating typescripts,
irrespective of context conditions.

It may be possible to account for the gen- 
erally higher interjudge agreement among psy- 
chologists in terms of their greater familiarity 
with rating tasks of the sort demanded by 
these studies. This does not, however, explain 
why the analyst group was handicapped by 
the additional context. It is possible that when 
this group was faced only with the therapist 
responses, it was able to impose a generally 
more meaningful rationale upon the data than 
when the additional context provided a much 
larger number of alternative hypotheses. Per- 
haps with the fuller context of having reviewed 
all of the preceding interviews, the analyst 
group would tend to converge on a more 
unanimous judgment. The possibility that
analytic training leads to a somewhat greater
ability to entertain multiple alternative hy- 
potheses in the face of larger amounts of con- 
textual information should not be overlooked. 

Whatever may account for the few differences
between psychoanalysts and clinical psychologists
in rating Depth of Interpretation,
it seems very clear that the two groups are
essentially interchangeable as observers in 
studies which use this method of rating. Even 
though some may question the validity of this 
particular method of differentiating Depth of 
Interpretation, it is fairly representative of the 
commonly used methods. Therefore, one may 
proceed with the confidence that choice within 
this particular range of observers will not have 
profound influences on the results obtained. 

We continue to be troubled by the generally 
low reliability of the Depth of Interpretation 
scale, although we have previously demon- 
strated the possibility of increasing interrater 
agreement by means of specific instruction in 
the application of the scale, and by using rat- 
ings of somewhat larger segments of the inter- 
view. We continue to feel that the approach 
to psychotherapy research by means of re- 
sponse-by-response analysis is the most promis- 
ing of several alternatives, and are convinced 
that the variable Depth of Interpretation is a 
meaningful and useful one in psychotherapy 
research. A recent study by Speisman (8) 
lends support to this contention. 


Summary 


In order to explore the possibility that in- 
creased perceptual sensitivity to the subtleties 
of the therapy relationship would allow raters 
to increase their interjudge agreement in rat- 
ing Depth of Interpretation, four psycho- 
analysts and four analysts in training were 
enlisted as expert judges. These judges rated 
four therapy interviews under different condi- 
tions of presentation and context. 

An analysis of variance carried out on a 
modified latin-square design revealed that 
these raters could distinguish between inter- 
views, but that neither the method of presenta- 
tion of the interviews nor the amount of con- 
textual information available to the judges had 
any systematic effect upon the means of re- 
sponse-by-response ratings. 

A correlation analysis revealed that this 
group of raters had generally lower interjudge 
reliabilities than a group of psychologists who 
had previously rated the same interviews 
under the same conditions. The analyst-fledgling
group among themselves showed significantly
better interjudge agreement when their
ratings were made under conditions of minimal
context (therapist only condition) than when 
increasing amounts of context were available 
(preceding interview condition). 

These results were discussed in terms of the 
greater familiarity of psychologists with rating 
tasks of the sort used, and the possibility was 
raised that analytic training made it possible 
to entertain a greater number of alternative 
hypotheses (and thus to decrease rating relia- 
bility) in the face of added contextual infor- 
mation. The essential identity of the two 
populations of judges was pointed out, and it 
was suggested that one may proceed with the 
confidence that choice within this particular 
range of observers will not have profound in- 
fluences on the results obtained in studies in 
which this and similar rating scales are used. 

The implications of the study for the con- 
tinued application of the Depth of Interpreta- 
tion rating scale to the process analysis of psy- 
chotherapy were discussed. 


Received September 9, 1957.


One of the problems which confronts most
mental hygiene clinics is premature termina- 
tion of psychotherapy. Many patients go 
through all the admission procedures, begin a 
course of long-term therapy, but discontinue 
treatment after only a few sessions. 

Several articles have dealt with the possible 
causes of premature termination of therapy 
(2, 7, 9, 10, 18) and a number of attempts 
have been made to predict which patients will 
terminate (3, 12, 14, 15, 17, 22). Most of the 
prediction studies have been concerned with 
the personality of patients, as inferred from 
test records and biographical data, that might 
be related to their unwillingness to remain in 
treatment. Very few studies have dealt with 
characteristics of therapists which might be 
responsible in part for patients discontinuing 
therapy. 

The present study is an attempt to discover 
whether different types of therapists tend to 
lose or hold in treatment different types of 
patients. It is not concerned with differences 
in the drop-out rate for different therapists, 
but rather the type of patient most apt to con-
tinue or discontinue treatment with various
types of therapists. It is thus a study of pa- 
tient-therapist compatibility. It is therefore in 
line with the contemporary emphasis on field 
theory, which conceives of therapy as an in- 
terpersonal process in which the patient’s be- 
havior and growth are seen not as a function 
of the patient’s personality alone, nor of the 
therapist’s personality and skill alone, nor the 
mere sum of these, but rather as dependent on
the particular nature of the interaction between
the patient’s characteristics and the therapist’s
characteristics. The results of such studies could
eventually be put to use for the purpose of selecting
the therapist most suitable for each patient. At
present, patients are assigned to or selected by
therapists at random or else on an intuitive basis
rather than on the basis of empirical research.


1 This article is based on a doctoral dissertation
submitted to the Department of Psychology, University
of Michigan, 1953. The author is now at Agnews
State Hospital.


In studies reported previously (14, 15) dealing 
with the prediction of premature termination of psy- 
chotherapy by means of the Rorschach test, it was 
found that the total number of responses given 
turned out to be the best predictor. Three other 
studies have reported somewhat similar findings (3, 
12, 17). In a number of clinics, therefore, it appears 
that patients who drop out of treatment within a 
few sessions usually give fewer responses on the 
Rorschach than patients remaining in treatment for 
a longer period of time. 

A high response total on the Rorschach has been 
attributed to or found related to a number of per- 
sonality and attitudinal variables including: produc- 
tivity (5, 24), achievement (26), intelligence (3, 20), 
verbal fluency (24), ego-involvement in the task 
(8), cooperativeness (3), responsivity (6), ease at 
forming associations (1), conflict awareness (25), 
high energy level (6), and initiative (24). Low re- 
sponse total has been attributed to excessive use of 
repression (16, 25), depression (5), incapacitating 
anxiety (11), passivity (24), a tendency to give up 
easily (24), and a hysterical indifference towards 
symptoms (24). 

The response total on the Rorschach seems thus to 
be related to a number of intellectual, temperamental, 
and motivational variables, all reflecting a type of 
productivity or drive which one would expect to be 
required for active participation in psychotherapy.
In fact, some studies have reported statistically significant
relationships between the number of Rorschach
responses given by the patient and ratings of
improvement in psychotherapy (21, 23). 


Although for the clinic as a whole, patients 
who were unproductive on the Rorschach 
tended to drop out of therapy prematurely, it 
was noticed by the author that for a few 
therapists this general finding did not seem to
apply. Whether or not productivity on the
Rorschach differentiated between patients re- 
maining and patients terminating seemed to 
depend on which therapist was doing the treat- 
ment. A series of studies was therefore carried 
out to determine whether different types of 
therapists tended to hold in treatment different 
types of patients and, specifically, whether
some therapists are able to hold in treatment 
the unproductive or unmotivated patients who 
usually do not remain long in treatment, and 
whether some therapists lose many of the pro- 
ductive and motivated patients who usually 
remain in treatment. 


Subjects and Setting 


The present study was carried out in the VA 
Mental Hygiene Clinic at Detroit, Michigan. 
The subjects are therefore veterans (almost all 
males) who applied for or were referred for 
psychotherapy at this clinic. All such patients
are entitled to free psychiatric treatment because
of a service-connected disability. Many
are referred for treatment after a routine pension
examination and often have the erroneous
impression that their pension will be jeopardized
if they do not report for treatment, and
some fear that their pensions will be taken
away if they recover as a result of treatment.

A wide variety of diagnostic categories are repre- 
sented among the patients at this clinic, the most 
common being anxiety reaction, psychoneurosis, latent 
schizophrenia, and character disorder. Almost all the 
patients are from urban areas. Most are employed as 
skilled or unskilled factory labor; some hold clerical 
or sales positions; only a few are in the professions 
or are business executives. The socioeconomic level 
of the patients is predominately lower and lower- 
middle class. About half the patients had not com- 
pleted high school. Upon admission to the clinic the 
patients are interviewed and usually are given the 
Rorschach test and one or two other psychological 
tests. 

The therapists at this clinic are for the most 
part analytically oriented and tend to em- 
phasize long-term methods of therapy. Thus 
most patients remaining less than 20 sessions 
are considered to have terminated before com- 
pletion of the treatment. 


Procedure 


The records of patients seen by certain 
therapists were taken from the files and classified
with respect to duration of therapy. As
in the previous study (14), only extreme cases 
were used. Patients dropping out of treatment 
within five sessions without extenuating cir- 
cumstances or the therapist’s recommendation 
were considered to have terminated prema- 
turely. Those remaining at least 20 sessions 
met our criterion for remaining in treatment. 
In the Detroit VA Clinic, about 40% of the 
patients terminate within five sessions and
about 30% remain in treatment 20 or more 
sessions. 

The patients were then further characterized 
in terms of their productivity on the Rorschach
test. Patients giving 25 or more Rorschach
responses were classified as “productive,”
and patients giving fewer than 25
Rorschach responses were classified as
“unproductive.”
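
The two classification rules just described, duration of therapy and Rorschach productivity, can be written out as a small sketch. The function and constant names are hypothetical, "within five sessions" is read here as five or fewer, and the `excused` flag is an assumed stand-in for extenuating circumstances or a therapist's recommendation to stop.

```python
from typing import Optional

PREMATURE_MAX_SESSIONS = 5      # terminated within five sessions
REMAINER_MIN_SESSIONS = 20      # remained at least 20 sessions
PRODUCTIVE_MIN_RESPONSES = 25   # 25 or more Rorschach responses


def duration_group(sessions: int, excused: bool = False) -> Optional[str]:
    """Classify a patient by duration of therapy; None means "not used".

    `excused` is a hypothetical flag for early endings that had extenuating
    circumstances or followed the therapist's recommendation.
    """
    if sessions >= REMAINER_MIN_SESSIONS:
        return "remainer"
    if sessions <= PREMATURE_MAX_SESSIONS and not excused:
        return "terminator"
    return None  # intermediate or excused cases are left out


def productivity_group(rorschach_responses: int) -> str:
    """Classify a patient by the total number of Rorschach responses."""
    if rorschach_responses >= PRODUCTIVE_MIN_RESPONSES:
        return "productive"
    return "unproductive"


print(duration_group(3), productivity_group(18))
print(duration_group(24), productivity_group(31))
```

Returning None for the middle range mirrors the decision to analyze only the extreme cases.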

A series of six statistical analyses was then 
carried out to determine whether the amount 
of difference in Rorschach productivity be- 
tween the remainers and terminators is re- 
lated to the following therapist variables: 
(a) unspecified differences between individual 
therapists, (b) the professional training of the
therapist, (c) the sex of the therapist, (d) the 
warmth of the therapist, (e) the competence 
of the therapist, and (f) the passivity of the 
therapist. 

In order to characterize the therapists along 
these last three dimensions, ratings were ob- 
tained on all the therapists whose patients 
were used in this study. Ratings were made by 
the three staff psychologists who were best 
acquainted with a large number of therapists 
at the clinic. The ratings were made on a 
three-point scale in which each rater charac- 
terized each therapist as being average, above 
average, or below average on three different 
variables: warmth, competence at analytically 
oriented therapy, and passivity. The raters 
were instructed to assign an approximately 
equal number of therapists to each category 
on the rating scale. The statistical analyses 
used patients of therapists falling near the 
extremes of these continua as reflected by 
agreement by two out of three of the raters 
that the therapist was either above or below 
average, the third rater not being in direct 
contradiction. 
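
The selection rule for therapists near the extremes can be sketched as follows. This is one reading of the rule, stated as an assumption: a therapist counts as "above" when at least two of the three raters say above average and none says below average, and symmetrically for "below". The function name and the rating labels are hypothetical.

```python
from typing import Optional, Sequence


def extreme_group(ratings: Sequence[str]) -> Optional[str]:
    """Return "above", "below", or None for one therapist on one variable.

    `ratings` holds the three raters' judgments, each "above", "average",
    or "below". None means the therapist's patients are not used.
    """
    above = sum(r == "above" for r in ratings)
    below = sum(r == "below" for r in ratings)
    if above >= 2 and below == 0:
        return "above"
    if below >= 2 and above == 0:
        return "below"
    return None  # no two-rater agreement, or a direct contradiction


print(extreme_group(["above", "above", "average"]))   # above
print(extreme_group(["below", "below", "above"]))     # None
```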

The following specific hypotheses related to
these rated characteristics of the therapists
were formulated and tested: 

1. Therapists who appear warm and friendly 
are able to hold in treatment more of the un- 
productive or less motivated patients than 
therapists who appear somewhat cold and 
distant. Being with a warm and friendly per- 
son may be a source of comfort to unproduc- 
tive and, hence, probably inadequate indi- 
viduals, and may serve to motivate some of 
the patients who were not very motivated to 
undergo psychotherapy at the time they were 
tested. 

2. Therapists who are highly competent at 
analytically oriented therapy lose fewer well- 
motivated, productive patients than therapists 
who are less competent. Although a competent 
therapist may lose many unmotivated or un- 
productive patients, because such patients 
may not be suitable candidates for analytically 
oriented therapy, he should not lose many 
productive and well-motivated patients. There- 
fore, loss of such patients was held to be due 
to errors in therapeutic technique, which are 
likely to be made more often by the less 
competent therapists. 

3. Extremely passive therapists hold in 
treatment fewer of the unproductive patients 
than therapists employing a more active pro- 
cedure. It was felt that the unproductive pa- 
tients included many who were passive de- 
pendents and needed a relatively active thera- 
pist to lean on or else they would become 
discouraged and give up. 


Results and Discussion 


Unspecified Differences Between Individual 
Therapists 


In order to determine whether individual 
therapists differ in regard to the type of pa- 
tients they are able to hold in treatment, six 
therapists were selected who had been at the 
clinic a long time. A search of the files dis- 
covered, for each therapist, seven patients who 
remained in treatment at least 20 sessions and 
seven patients who terminated within five ses- 
sions. The number of Rorschach responses of 
each patient was noted and a nonparametric 
analysis of interaction was carried out in the 
manner recommended by Mood (19). The 
analysis permits eliminating any variability
in patients’ productivity associated with (a)
the therapists’ selection of patients, as would
be reflected in differences between the median
number of responses of different therapists’
patients in general, and (b) over-all differences
in scores between remainers and terminators
in general, thus leaving as the sole
source of variation the patient-therapist interaction
component.


Table 1

Median Number of Rorschach Responses Given by
the Remainers and Terminators of
Different Therapists

                       Therapists
Patients        A    B    C    D    E    F
Terminators
Remainers

[Cell values are not legible in the source except for a single Remainers entry of 35, whose column is unclear.]

A significant amount of patient-therapist 
interaction was found (p= .01), which im- 
plies that individual therapists do differ in 
regard to the type of patients tending to con- 
tinue or discontinue treatment. 

Table 1 shows the median number of Ror- 
schach responses of the remainers and ter- 
minators of each of the six therapists. It will 
be noted that for two of the therapists there 
was no appreciable difference in Rorschach 
productivity between the remainers and ter- 
minators. For the other four therapists, how- 
ever, the patients who remained in treatment 
gave about twice as many responses as those 
terminating. Thus remainers and terminators 
differ in Rorschach productivity for certain 
therapists but not for others. Some therapists 
are able to keep the unproductive patients in 
treatment, other therapists are not. It follows, 
therefore, that whether or not Rorschach pro- 
ductivity can be used to predict continuance 
in therapy depends to some extent on the type 
of therapist involved. The over-all difference 
in Rorschach productivity between the remainers
and terminators for this particular
clinic, as reported previously (14, 15), implies 
that most of the therapists in this clinic are 
like Therapists B, C, D, and F, rather than 
like A and E. Other clinics made up of thera- 
pists like A and E would probably find no 
significant difference in Rorschach productivity
between patients remaining in treatment
and patients terminating prematurely. These 
findings may account for some of the incon- 
sistencies in the reports from different clinics 
concerning their attempts to predict premature 
termination of therapy from the Rorschach 
(3, 4, 12, 15, 17, 22). 


Professional Training of the Therapist 


In order to determine whether the profes- 
sional training of the therapist makes any dif- 
ference in regard to the type of patients most 
apt to continue or discontinue treatment, 78 
patients (39 remainers and 39 terminators) 
were categorized according to the professional 
training of their therapist (i.e., psychiatrist, 
clinical psychologist, or psychiatric social 
worker) and a nonparametric analysis of in- 
teraction was carried out. The results of the 
analysis failed to reveal a significant amount 
of patient-therapist interaction. Apparently, 
whether a therapist is a psychiatrist, clinical 
psychologist, or psychiatric social worker 
makes little or no difference with respect to 
the Rorschach productivity of patients tend- 
ing to remain in treatment. 


Sex of the Therapist 


In order to determine whether the sex of 
the therapist makes a difference in regard to 
the type of patient most apt to continue or 
discontinue treatment, a nonparametric analy- 
sis of interaction was carried out in which pa- 
tients were categorized according to the sex 
of their therapist. For this purpose, 40 pa- 
tients seen by seven female psychologists and 
social workers were compared with 40 patients 
seen by seven male psychologists and social
workers. Patients seen by psychiatrists were
not used since there were no female psychiatrists
at the clinic.


Table 2

Median Number of Rorschach Responses Given by the
Remainers and Terminators of Therapists
Categorized with Respect to Sex

                            Therapists
Patients                  Male    Female
Terminators (N = 40)      15.5    25.0
Remainers (N = 40)        30.5    22.5

The result of the analysis of interaction 
was significant at the .05 level of confidence, 
suggesting that the sex of the therapist is a 
factor determining whether a given type of 
patient will remain in treatment or not. Table 
2 indicates the median number of Rorschach 
responses given by the patients remaining and 
the patients terminating treatment with male 
therapists and female therapists. 

It will be noted that productivity on the 
Rorschach is not related to continuance in 
therapy with the female therapists employed 
at this clinic, whereas it is associated with 
continuance for patients seen by male thera- 
pists. 
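
The pattern in Table 2 can be made concrete by subtracting the terminators' median from the remainers' median for each therapist sex. The short sketch below does only that descriptive arithmetic on the four medians in the table. It is not the nonparametric analysis of interaction used in the study, and the variable names are illustrative.

```python
# Median Rorschach responses from Table 2, keyed by therapist sex.
medians = {
    "male": {"terminators": 15.5, "remainers": 30.5},
    "female": {"terminators": 25.0, "remainers": 22.5},
}

# Remainer-minus-terminator gap in median productivity for each sex.
gaps = {sex: m["remainers"] - m["terminators"] for sex, m in medians.items()}

# Difference between the two gaps: a descriptive index of the interaction.
interaction_contrast = gaps["male"] - gaps["female"]

print(gaps)                  # {'male': 15.0, 'female': -2.5}
print(interaction_contrast)  # 17.5
```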

Of course it would be unwise to generalize 
this finding to female therapists in general 
since the records of patients seen by only 
seven female therapists, all employed in a 
single clinic, were used in this study. How- 
ever, it is clear that in the Detroit VA Clinic, 
at least, the female therapists were able to 
keep in treatment many of the patients giving 
few Rorschach responses, i.e., the type of pa- 
tients who tend to terminate when assigned to 
male therapists. But, it must also be noted 
that the female therapists also tend to lose 
proportionately more of the productive pa- 
tients than do the male therapists. 

The reason is not clear, but certain pos- 
sibilities may be suggested. The group giving 
few Rorschach responses might include many 
passive dependent or frightened individuals 
who crave a warm supportive relationship with 
a mother substitute and may feel threatened 
by father figures. Another possibility is that 
some patients not really adequately motivated 
for therapy might relish the opportunity to 
enjoy a psychologically close relationship with 
someone of the opposite sex and thus achieve 
partial gratification of erotic impulses. Still 
another possibility is that the female thera- 
pists might happen to be warmer and more 
friendly persons than the male therapists at 
this clinic. It would be interesting to find out 
whether the same relationships would hold 
true in other clinics. 


Table 3 

Rorschach Productivity of Patients Remaining in 
Therapy with Therapists Categorized 
with Respect to Warmth 

                 No. of patients seen      Proportion who gave 
Kind of          at least 20 sessions by   less than 25 
therapist        each type of therapist    Rorschach responses 

Warm             32                        63% 
Cold             38                        24% 

Note. t = 3.55, p = .0005. A one-tailed test was used to 
compute the significance level since the direction of the differ- 
ence had been predicted. 





Personality and Ability of the Therapist 


A series of three statistical comparisons was 
carried out to test the hypotheses 
presented in the procedures section. 

Since it was hypothesized that the warm 
and friendly therapists hold in treatment more 
of the unproductive patients than do the cold 
and distant therapists, the proportion of un- 
productive (R less than 25) patients among 
those remaining in treatment with the thera- 
pists rated most warm was compared with the 
proportion of unproductive patients among 
those remaining in treatment with the thera- 
pists rated least warm. 

The results shown in Table 3 confirm the 
first hypothesis. Of those patients who re- 
mained in therapy with cold therapists, 76% 
gave productive Rorschachs and only 24% 
were not productive, whereas of the patients 
who remained in treatment with warm therapists, 
63% were not productive. Thus the 
cold therapists tended to keep in treatment 
only the productive well-motivated patients, 
whereas the warm therapists were able to 
keep in treatment a sizeable proportion of the 
unproductive or less motivated patients as 
well. In fact, the proportion of unproductive 
patients who remained in treatment with the 
warm therapists corresponds closely to the 
proportion of unproductive patients in the 
total intake population at this clinic. There- 
fore, if the patient is assigned to a warm and 
friendly therapist, productivity on the Rorschach 
cannot be used to predict whether or 
not he will remain in treatment. 
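
The comparison in Table 3 can be sketched as a test of the difference between two independent proportions. The report does not state which formula was used, so the unpooled standard error below is an assumption, as are the function names; under that assumption the statistic comes out close to the reported t of 3.55.

```python
from math import erf, sqrt


def two_proportion_z(p1, n1, p2, n2):
    """Difference between two independent proportions divided by an
    unpooled standard error (an assumption, not stated in the report)."""
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se


def one_tailed_p(z):
    """Upper-tail probability of a standard normal deviate."""
    return 0.5 * (1 - erf(z / sqrt(2)))


# Table 3 values: warm therapists (63% of 32) vs. cold therapists (24% of 38).
z = two_proportion_z(0.63, 32, 0.24, 38)
p = one_tailed_p(z)
print(f"z = {z:.2f}, one-tailed p = {p:.4f}")
```

The normal approximation is used here for the tail probability; a table lookup of the kind available in 1958 would give a coarser value.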

The fact that the warm therapists are able 
to hold in treatment the unproductive patients 
as well as the productive ones may or may not 
be desirable. One would have to know some- 
thing about the outcome of the treatment in 
order to tell. Perhaps some patients who are 
really unable to benefit from analytically 
oriented psychotherapy will remain in treat- 
ment because they enjoy the warm and friendly 
behavior of the therapist toward them. The 
treatment of other patients, who could derive 
some real benefit from therapy, might thus be 
delayed or prevented, since each therapist can 
only see a limited number of patients. 

Since it was hypothesized that the most 
competent therapists lose fewer productive 
well-motivated patients than the least com- 
petent therapists, the proportion of productive 
patients among the terminators seen by the 
most competent therapists was compared with 
the proportion of productive patients among 
the terminators seen by the least competent 
therapists. The findings summarized in Table 
4 confirm the second hypothesis. Therapists 
rated as most competent at analytically 
oriented therapy do lose a significantly (p 
= .03) smaller percentage of productive well- 
motivated patients than therapists rated as 
least competent. 

The third hypothesis was not confirmed by 
the findings. Rated passivity of the therapist 
was not found to be significantly related to 
whether or not unproductive patients would 
remain in treatment. Either the theory which 
led to this prediction was in error or else the 
ratings of passivity did not really reflect the 
therapist’s behavior in therapy. 


Table 4 

Rorschach Productivity of Patients Breaking Off 
Therapy with Therapists Categorized 
with Respect to Competence 

                   No. of patients       Proportion who gave 
Kind of            breaking off with     25 or more 
therapist          each type of          Rorschach responses 
                   therapist 

Most competent     [illegible]           29% 
Least competent    20                    55% 

Note. t = 1.94, p = .03. A one-tailed test was used to 
compute the significance level since the direction of the differ- 
ence had been predicted. 

Because of the exploratory nature of this 
study, only a few of the many possible anal- 
yses were made and only the most obvious 
hypotheses were tested. However, the method 
used appears to be a useful one and may war- 
rant further consideration. It is suggested that 
patients be characterized on a number of 
variables other than mere productivity. For 
instance, social class, intelligence, anxiety, 
hostility, depression, introversion, passive de- 
pendency and its reaction formation, homo- 
sexuality, paranoid tendencies, schizoid tend- 
encies, psychopathy, over- or undercontrol of 
affect, etc. might be important variables on 
which patients could be characterized. Like- 
wise a number of variables of possible im- 
portance in carrying out effective therapy 
with different types of patients might be used 
to characterize the therapists. Hypotheses 
could then be formulated regarding the best 
type of therapist for each type of patient. As 
to the criterion of compatibility, it may be 
more useful in the long run to use improve- 
ment rather than duration of therapy, al- 
though the former criterion presents more dif- 
ficult measurement problems. 


Summary 


A study was carried out at the Detroit VA 
Mental Hygiene Clinic to determine whether 
therapists differ with respect to the type of 
patients tending to remain in treatment or 
break off prematurely, and, if so, what char- 
acteristics of the therapists may be responsible 
for the differences in their patients’ reactions 
to them. 

In this study, patients were characterized in 
terms of their productivity on the Rorschach, 
since this variable had been found previously 
to be related to premature termination of 
therapy. Therapists were characterized in 
terms of professional training, sex, warmth, 
competence at analytically oriented therapy, 
and passivity. A series of statistical analyses 
was then carried out to determine whether 
these therapist traits were related to the Ror- 
schach productivity of patients continuing or 
discontinuing treatment. 

The results were as follows: 

1. Therapists in general differ in regard to 
the type of patients who continue or discon- 
tinue treatment with them. 

2. Whether the therapist is a psychiatrist, 
clinical psychologist, or psychiatric social 
worker seems unrelated to the type of patients 
who continue or discontinue treatment. 

3. Whether the therapist was male or fe- 
male did make a difference. The female thera- 
pists tended to keep in treatment more of the 
unproductive patients but also tended to lose 
slightly more of the productive patients than 
did the male therapists. 

4. Therapists rated as most warm and 
friendly were able to keep in treatment a 
larger percentage of unproductive patients 
than therapists rated as least warm and 
friendly. 

5. Therapists rated as most competent at 
analytically oriented therapy tended to lose 
fewer productive patients than therapists rated 
as least competent. 

6. Rated passivity of the therapist seemed 
unrelated to the productivity of patients re- 
maining in therapy. 

It was suggested that the methods em- 
ployed in this study could be applied to fur- 
ther studies of patient-therapist compatibility, 
using a greater variety of patient and therapist 
variables and using improvement rather than 
mere continuation in therapy as a criterion of 
compatibility. Such a study would permit a 
more effective basis for the assignment of pa- 
tients to therapists than is now in practice. 
And this would probably decrease the amount 
of premature termination of therapy and thus 
add to the clinic’s efficiency. 


Received October 4, 1957. 


The Relationship of Stimulus Ambiguity on 
the TAT to the Productivity of Themes¹ 


Bernard I. Murstein 
University of Portland 


One of the problems confronting the cli- 
nician is the relationship of the ambiguity of 
TAT cards to the stories elicited. Medium- 
ambiguous pictures have been shown by 
Kenny (3) to educe the most personality- 
revealing stories. The present study sought to 
measure the relationship between ambiguity 
and the productivity of themes. A curvilinear 
relationship was hypothesized. 

The rank order of 20 “male series” TAT 
cards obtained by Bijou and Kenny (1) for 
28 college males, 23 college females, and total 
group of 51, was used to measure the am- 
biguity (number of possible interpretations) 
variable. The productivity variable was meas- 
ured by the rank order of number of themes 
obtained by Eron (2) with the same cards 
for 150 male veterans of whom 50 were col- 
legians. The ranks of both variables were con- 
verted to normalized scores on the assumption 
that both variables were normally distributed. 
Pearson r’s and η’s were obtained as well as 
chi squares and probability values for the dif- 
ferences between the two correlations. 
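
The contrast between a linear and a curvilinear index can be illustrated with a small sketch. It assumes the curvilinear index is the correlation ratio (eta), converts ranks to normal scores through the midpoint percentile (rank − 0.5)/n, and bins the predictor into equal-sized groups; the binning, the percentile rule, and the inverted-U data are all hypothetical and are not taken from the study.

```python
from statistics import NormalDist, mean


def normalized_scores(ranks):
    """Convert ranks 1..n to normal deviates via midpoint percentiles (assumed rule)."""
    ranks = list(ranks)
    n = len(ranks)
    nd = NormalDist()
    return [nd.inv_cdf((r - 0.5) / n) for r in ranks]


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def correlation_ratio(x, y, n_bins):
    """Eta of y on x: sqrt(between-group SS / total SS), grouping y by
    equal-count bins of x. Assumes len(x) is divisible by n_bins."""
    pairs = sorted(zip(x, y))
    size = len(pairs) // n_bins
    grand = mean(y)
    ss_total = sum((b - grand) ** 2 for b in y)
    ss_between = 0.0
    for i in range(n_bins):
        group = [b for _, b in pairs[i * size:(i + 1) * size]]
        ss_between += len(group) * (mean(group) - grand) ** 2
    return (ss_between / ss_total) ** 0.5


# Hypothetical data: 20 cards whose productivity peaks at medium ambiguity.
ambiguity = normalized_scores(range(1, 21))
productivity = [-(a ** 2) for a in ambiguity]
r = pearson_r(ambiguity, productivity)
eta = correlation_ratio(ambiguity, productivity, n_bins=5)
print(f"r = {r:.2f}, eta = {eta:.2f}")
```

With data of this shape the linear coefficient is essentially zero while eta is substantial, which is the pattern the report describes for the male and total groups.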

Results of the correlation between the Eron 
group and the various combinations of the 
Bijou group were: Regression of productivity 
on ambiguity, Bijou males r .07, η .61 (p 
< .10); Bijou females r .03, η .43 (p < .60); 
Bijou total r .09, η .86 (p < .001). Regression 
of ambiguity on productivity, Bijou 
males r .07, η .95 (p < .001); Bijou females 
r .03, η .43 (p < .60); Bijou total r .09, 
η .73 (p < .01). 

¹ An extended report of this study may be ob- 
tained without charge from Bernard I. Murstein, 
Psychology Department, University of Portland, 
Portland 3, Oregon, or for a fee from the American 
Documentation Institute. Order Document No. 5695, 
remitting $1.25 for microfilm or $1.25 for photo- 
copies. 

The results indicate a significant curvilinear 
relationship and a nonsignificant linear 
relationship between ambiguity and productivity. 
The curvilinear relationship was such 
that a medium degree of ambiguity elicited 
maximum productivity, while both high and 
low ambiguity resulted in less productivity. 
The smaller degree of relationship found when 
using the Bijou female group was probably 
a function of the “male series” of cards and 
the difference in sex from the all-male Eron 
group. 

A perusal of the TAT cards yielded the 
opinion that the moderately ambiguous cards 
were, for the most part, quite clear in so far 
as perception of the stimulus properties was 
concerned. The ambiguity appeared to re- 
side in the question as to what emotions the 
characters were experiencing. Future research 
might well be directed towards investigating 
whether greater “personality revealingness” is 
obtained from vague stimulus figures or from 
clear stimulus figures whose feelings or ac- 
tivities are not readily apparent. 

Brief Report. 
Received May 26, 1958. 


Countertransference Effects in Psychotherapy¹ 


Richard L. Cutler 


University of Michigan 


The therapist’s personality has long been 
recognized as one of the most important vari- 
ables in psychotherapy. Psychoanalytic writers 
have stressed repeatedly the importance of a 
personal analysis for the therapist, pointing 
out the deleterious effects upon the treatment 
which his unresolved conflicts may have. In 
spite of this emphasis in the theoretical litera- 
ture and in practice, researchers have had 
little success in their attempts to investigate 
the influence of the therapist’s personality 
upon his conduct of psychotherapy. 

Two reasons for this lack of success are 
immediately apparent. The first, the extreme 
difficulty in the operational definition of psy- 
choanalytic concepts, is common to all re- 
search on the psychoanalytic theory of per- 
sonality. The second is unique to research in 
psychotherapy, particularly to that dealing 
with countertransference. It is the fact that 
most therapists, having training in at least 
the rudiments of personality theory and psy- 
chodiagnosis, do not readily drop their de- 
fenses to allow us to examine their conflicts 
and inner feelings, at least not short of the 
analytic couch. 


Problem 


The present study approaches the problem 
of the manifestation and effects of counter- 
transference through a reinterpretation of the 
concept in terms of Bruner’s theory of per- 
ception. In this way, some of the operational 
problems are overcome. In addition, by means 
of a rating device developed specifically for 
the purpose, it allows the identification of 
conflict areas in the therapist’s personality, 
and the prediction of their effects upon the 
moment-to-moment interaction between pa- 
tient and therapist. 

¹ This study was made possible by the support of 
U.S.P.H.S. Project M-516, “Analyses of Therapeutic 
Interaction,” E. S. Bordin, Principal Investigator. 
The author wishes to thank all staff members whose 
cooperation made it possible. 

Countertransference, as examined in this 
research, is defined as the transference reac- 
tions of the therapist to his patient. Fromm- 
Reichmann (8) defines transference in its 
most general sense as “the process of trans- 
ferring to and repeating early patterns of in- 
terpersonal relatedness with present day part- 
ners.” In its special application to the thera- 
peutic relationship, she points out that it 
means transferring onto the therapist, with no 
or minimal basis in the realities of the situa- 
tion, early experiences in interpersonal re- 
latedness. 


Fenichel (4) discusses transference in terms of im- 
pulses arising and being experienced which are in- 
appropriate to the situation in which they arise. He 
emphasizes the tendency of the person experiencing 
transference reactions to attempt to reconcile the ex- 
perience of the inappropriate impulses with the re- 
alities of the situation, and thereby to distort these 
realities to fit the demands of his own needs. He 
agrees with Freud (7) in terming the appearance of 
such distortions in the therapist as countertrans- 
ference. 

Herman Nunberg, speaking as a member of a 
panel on “Problems of Transference and Counter- 
transference,” dealt with the relationship between 
the unconscious needs of a person and his percep- 
tions of reality. He says: “. . . the patient desires to 
find satisfactions for his needs in the person of the 
analyst, and attempts to change him into an object 
which will provide this satisfaction. He is frustrated 
because the analyst is not this object” (9, p. 26). 

From this point of departure, Nunberg sketched 
the following picture: We carry within us images of 
the need-satisfying objects of old and try to find 
them in the outside world. We seek something in re- 
ality that is exactly like them and desire to bring 
about an identity between the perception of present 
objects and the images within us. The patient may 
unconsciously try to change reality so as to conform 
with the inner picture, or may seize upon some 
detail in the present reality which is similar to the 
desired object, and, aided by such detail, project his 
inner images upon the current reality. This process 
Nunberg sees as transference. 

We find a parallel formulation by Bruner 
(2) in the field of perception. Postulating a 
three-step process in perception, which consists 
of (a) hypothesis (set, Aufgabe, determining 
tendency); (b) input of information 
from the environment; and (c) confirmation 
or infirmation of the hypothesis, Bruner con- 
siders in detail the influence of need upon 
perception. Hypothesis strength, a central 
concept in Bruner’s theory, is in part de- 
pendent upon the consequences which the 
particular hypothesis has in aiding the organ- 
ism in the fulfillment of needs. The more 
basic the confirmation is to the carrying out 
of need-satisfying activity, the greater will be 
its strength. Bruner states three theorems 
which are contingent upon this concept of 
hypothesis strength: 

1. The stronger an hypothesis, the greater its like- 
lihood of arousal in a given situation. 

2. The greater the strength of the hypothesis, the 
less the amount of appropriate information neces- 
sary to confirm it. 


3. The greater the strength of an hypothesis, the 
more the amount of inappropriate or contradictory 
information necessary to infirm it (2, p. 126). 

Thus it can be seen that strong need-satis- 
fying hypotheses will tend to be confirmed on 
the basis of minimal appropriate information 
from the environment, a situation which is 
directly analogous to what Nunberg terms 
“seizing upon some detail in the environment 
in order to project inner wishes upon the cur- 
rent reality.” We may conclude that the trans- 
ference and countertransference phenomena 
are therefore special cases of perception be- 
ing influenced by need. 

The above formulation permits us to ap- 
proach the problem of countertransference 
from the frame of reference of the influence 
of the therapist’s needs upon his perception 
of his own and the patient’s behavior in psy- 
chotherapy. If the formulation is valid, we 
should find certain systematic relationships 
between the needs and conflicts of the thera- 
pist and his ability to report objectively the 
behavior of himself and the patient in psy- 
chotherapy. When no conflict is present, and 
no defensive countertransference reactions are 
necessary, he should perceive and report this 
behavior with relative objectivity. On the 
other hand, when the stimulus situation is 
such that material impinges upon the thera- 
pist’s needs and conflicts, we should find sys- 
tematic tendencies to omit, distort, or over- 
emphasize certain aspects of the behavior. 
This discussion leads to the first major hy- 
pothesis of the study: In reporting his be- 
havior and that of the patient, the therapist 
will over- or underemphasize material re- 
lated to his own needs and conflicts as com- 
pared to material which is related to rela- 
tively conflict-free areas in his personality. 
If such effects do indeed result from the 
influence of the therapist’s needs, it follows 
that there will be certain consequences for 
the process of therapy itself. One of these 
may be a rather gross misjudgment of what 
the patient’s characteristic interpersonal rela- 
tionships are like. In practice, however, steps 
are taken to prevent such gross errors by the 
therapist. By means of supervisory sessions 
and the utilization of projective tests, it is 
possible to check the therapist’s judgments 
against less involved and more objective cri- 
teria. However, in the moment-to-moment in- 
teraction between therapist and patient, there 
is little immediate opportunity to control the 
distortions of the therapist. It is here that we 
may expect to find the clearest effects of the 
influence of the therapist’s needs and conse- 
quent distortions upon the process of therapy. 
Stekel (10) says, “. . . scotomata in the 
therapist due to unresolved infantile conflicts 
prevent him from dealing effectively with 
similar material when it is presented by his 
patient.” This formulation is based upon ex- 
tensive clinical experience and a profound 
knowledge of psychoanalytic theory, but it 
may be derived independently from the “need- 
influences-perception” frame of reference. Sup- 
pose that the patient exhibits behavior which 
touches directly or symbolically upon conflict 
areas in the therapist. This material will be 
threatening to the therapist, and immediate 
defensive operations will need to be under- 
taken to control the threat. This defensive 
operation may take the form of a perceptual 
distortion by the therapist of what the pa- 
tient is doing. In addition, a portion of his 
attention and energies must necessarily be 
diverted away from the task of psychotherapy 
to the equilibrium-restoring process of de- 
fense. This will then become evident in a less 
adequate handling by the therapist of such 
conflict-relevant material. The above discus- 
sion leads to the second major hypothesis of 
the study: When behavior exhibited by the 
patient is similar to behavior which has been 
identified as conflictual for the therapist, the 
therapist’s responses to this behavior will be 
judged to be significantly less adequate for 
therapeutic purposes than his responses to 
material which is relatively nonconflictual for 
him. 
Procedure 


Therapist conflict areas were identified by 
a method which consisted essentially of com- 
paring therapist’s self-ratings on certain per- 
sonality traits with ratings made by judges 
who were attempting to describe the same 
therapist with regard to the same personality 
traits. Similar self-vs.-judge ratings have been 
used previously by other investigators, and 
the reliability of such ratings was sufficiently 
high to encourage their use in the present 
study (1, 5). The personality traits on which 
the therapists and judges were asked to make 
ratings were derived from a device called 
“The Circle” (6) which had been developed 
specifically for the purpose of categorizing in- 
terpersonal relationship materials of the kind 
found in therapy recordings, TAT stories, etc. 
The use of conflict and nonconflict areas based 
directly upon this coding system permitted a 
direct relating of the therapist need areas 
to the material collected in his therapy in- 
terviews and reports. An attempt was made 
to select words which would be descriptive 
of meaningful personality traits, and which 
would also adhere closely in meaning to the 
various points on the Circle. The sixteen ad- 
jectives finally selected are listed below: (Let- 
ters denote code category on the Circle.) 


A. Dominating      I. Submissive 
B. Boastful        J. Respectful 
C. Rejecting       K. Dependent 
D. Punitive        L. Agreeable 
E. Critical        M. Affiliative 
F. Complaining     N. Supportive 
G. Suspicious      O. Generous 
H. Apologetic      P. Advising 

Each subject therapist was asked to place 
on a 19-point scale, from “most character- 
istic” to “least characteristic,” each of the 16 
traits as he felt it applied to himself in his 
day-to-day dealings with people. Nine or more 
judges, each of whom was personally well ac- 
quainted with the therapist he rated, carried 
out the same task, i.e., he rated each of the 
16 traits as he felt it applied to the subject 
on the basis of his day-to-day personal and 
professional contact with the subject. 

Conflict areas were identified as those traits 
which showed a significant disagreement² 
between the self-rating of the therapist and the 
judges’ ratings. The assumption upon which 
this procedure rests is as follows: judges’ rat- 
ings of a therapist are based upon how the 
therapist actually behaves in relation to each 
judge. If the therapist overrates or underrates 
himself on a particular trait, it is because he 
is unable to see himself as he really is. The 
reason for this lack of objectivity is conflict 
which the therapist has over the expression 
or nonexpression of the particular trait. 

Using this technique, two types of conflict 
could be distinguished. The first, consisting 
of those traits on which the therapist over- 
rated himself as compared to the judges’ rat- 
ings of him, will hereafter be denoted as 
“plus-conflict.” The second type, in which the 
therapist underrated himself, will be called 
“minus-conflict.” Traits were judged to be 
relatively conflict-free for a subject when the 
judges’ ratings clustered symmetrically about 
the therapist’s self-rating. It is obvious that 
the total of the traits identified as conflict 
areas plus those identified as conflict-free or 
neutral will not account for all of the 16 be- 
havior traits in each subject. In those cases 
where neither an identification of “conflict” 
nor “neutral” could be made, it was felt that 
the information provided by the therapist 
evaluation device was too equivocal 
to permit any predictions to be made 
about these traits. 

² The statistic used to establish significance of dis- 
agreement between subjects’ self-ratings and judges’ 
ratings was the “sign test.” This is a nonparametric 
statistic which does not require the assumption of 
underlying normality of the population of judges’ 
ratings on a particular trait. It merely requires that 
each judge’s rating can be categorized as “higher” or 
“lower” than the subject’s rating on any particular 
trait. Significance tables are available for this sta- 
tistic and reveal that with nine judges, an 8-1 split 
is significant at the .05 level, and a 9-0 split is sig- 
nificant at the .01 level (3). In cases where the sub- 
ject’s rating on a trait tends toward the extreme of 
the rating scale, the sign test becomes increasingly 
less applicable, so that trait judgment discrepancies 
which were the result of the subject rating himself at 
one of the extremes of the scale were not used as 
conflict areas. 
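
The sign test described in footnote 2 can be sketched as a two-sided binomial computation. The function names, the convention that a higher number means "more characteristic," the dropping of tied ratings, and the label "undetermined" for non-significant splits are assumptions made for illustration; the report does not give these details, and its separate "neutral" category (judges clustering symmetrically about the self-rating) is not modeled here.

```python
from math import comb


def sign_test_p(n_higher, n_lower):
    """Two-sided probability of a split at least this uneven when each
    judge is equally likely to rate above or below the self-rating."""
    n = n_higher + n_lower
    k = max(n_higher, n_lower)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def classify_trait(self_rating, judge_ratings, alpha=0.05):
    """Label a trait from the judges' ratings relative to the self-rating.
    Ties are dropped (assumption). Judges mostly higher means the
    therapist underrated himself (minus-conflict), and vice versa."""
    higher = sum(j > self_rating for j in judge_ratings)
    lower = sum(j < self_rating for j in judge_ratings)
    if higher + lower == 0 or sign_test_p(higher, lower) >= alpha:
        return "undetermined"
    return "minus-conflict" if higher > lower else "plus-conflict"


for split in [(9, 0), (8, 1), (7, 2)]:
    print(split, round(sign_test_p(*split), 4))
```

With nine judges this reproduces the thresholds quoted in the footnote: an 8-1 split falls below .05 and a 9-0 split below .01, while a 7-2 split does not reach significance.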

In all, ten subjects were pretested in order 
to identify conflict areas. There was no sys- 
tematic tendency for any single trait or group 
of traits to be identified as “conflictual” across 
the subjects, nor were any of the traits sys- 
tematically overrated or underrated across all 
subjects. 

Two therapists were selected for the major 
portion of the study. The major factor in the 
selection was the requirement that they have 
low similarity insofar as identified neutral and 
conflict areas, although considerations of time, 
willingness to cooperate, and availability of 
patients were also involved. 

Therapist 1, who had had three years of 
graduate training in clinical psychology, more 
than 300 hours’ experience as a therapist, and 
had completed a personal psychoanalysis, tape 
recorded three consecutive therapy interviews 
with each of three of his patients in a college 
student counseling center. Therapist 2 was in 
his second year of graduate training in clinical 
psychology, had had less than 50 hours’ ex- 
perience as a therapist, and had had no per- 
sonal psychotherapy. He recorded four con- 
secutive sessions with each of two patients, 
who were being seen in a Veterans Administration 
facility. 

In addition to tape recording the interviews, 
each therapist dictated, immediately following 
each therapy hour, a detailed account of what 
he thought had transpired during the 
session. These reports were not less 
than 500 words long, and included his im- 
pressions of all of his own and the patient’s 
behavior during the hour. The tape-recorded 
interviews and dictated reports were tran- 
scribed verbatim according to a standard set 
of transcription rules, which permitted identi- 
fication of pauses, changes in rate of speech 
and vocal expressions of emotion, as well as 
the actual verbal content. 

The interviews were coded on the Circle, 
and by summing across all the interviews of 
a given therapist, the total number of actual 
manifestations by the therapist and by the 
patient of each of the 16 interpersonal be- 
havior categories of the Circle was ascertained. 
This summation was assumed to be the objective 
picture of the behavior of the therapist 
and his patients during the series of inter- 
views. 

A similar coding was carried out on the 
therapist reports. By summing the number of 
occurrences of each of the 16 behaviors across 
all the reports of a given therapist, a picture 
was obtained of each therapist’s perception of 
his own and his patients’ behavior. 

The coding reliability of the categorization 
scheme for interview and report material was 
established by spot-checking interviews which 
were randomly selected from the total group. 
One judge coded the entire set of interviews, 
another coded the entire set of reports, and 
a third coded three randomly selected inter- 
views and three randomly selected reports in 
order to establish interjudge reliability. Since 
the usual reliability statistics are not appli- 
cable to codings of this kind, reliability was 
estimated simply by counting the number of 
exact agreements in coding a given statement 
or response, as well as partial agreements and 
disagreements.³ For the three interviews, exact 
agreement ranged from 62% to 71%, partial 
agreement from 24% to 33%, and disagree- 
ment from 5% to 10%. The reliability esti- 
mates were somewhat lower for the report ma- 
terial, but this was partially due to the fact 
that the reports were not coded sentence by 
sentence, and there was no requirement that 
each sentence be coded. It was felt that un- 
der these circumstances, it was not meaning- 
ful to express agreement in terms of percent- 
ages, and product moment correlations were 
run between the total numbers of ratings of 
each trait by the two judges. The results of 
three such tests yielded r’s of .75, .81, and 
.84, and were interpreted as indicating that 
the reports could be reliably coded by inde- 
pendent judges. In view of the fact that only 
the total number of appearances of each trait 
in the reports entered into the testing of the 
hypotheses, it was felt that demonstrating 
point-to-point agreement among the judges 
was not necessary so long as the total num- 
ber of codings for each trait agreed to the 
extent indicated above. 

³ Since a single response by therapist or patient 
might contain one, two, or more Circle category 
codes, partial agreement was possible. The rule of 
thumb established for determining complete agree- 
ment, partial agreement, and disagreement was as 
follows: only when all codes for a given response 
corresponded exactly was complete agreement indi- 
cated; when half or more corresponded, it was con- 
sidered a partial agreement; and when less than half 
corresponded, it was considered a disagreement. 
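
The rule of thumb in footnote 3 for scoring interjudge agreement on a single response might be sketched as follows. The footnote does not say what "half" is measured against; taking it as half of the larger of the two judges' code sets is an assumption, as are the function names and the example codes.

```python
def classify_agreement(codes_a, codes_b):
    """Compare two judges' Circle codes for one response.
    'exact' when the code sets match; 'partial' when the shared codes
    are at least half of the larger set (assumed reading of the rule);
    otherwise 'disagreement'."""
    a, b = set(codes_a), set(codes_b)
    if a == b:
        return "exact"
    shared = len(a & b)
    if 2 * shared >= max(len(a), len(b)):
        return "partial"
    return "disagreement"


def agreement_rates(pairs):
    """Percentage of responses falling in each agreement category."""
    counts = {"exact": 0, "partial": 0, "disagreement": 0}
    for codes_a, codes_b in pairs:
        counts[classify_agreement(codes_a, codes_b)] += 1
    total = len(pairs)
    return {label: 100 * n / total for label, n in counts.items()}


# Hypothetical codings of three responses by two judges.
example = [(["A"], ["A"]), (["A", "B"], ["A"]), (["A", "B", "C"], ["D"])]
rates = agreement_rates(example)
print(rates)
```

Applied response by response to an interview, tallies of this kind would yield percentages comparable in form to the 62% to 71% exact agreement reported above.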

Specific predictions were made regarding 
the relative accuracy of the therapist’s re- 
ports of behavior which had been identified 
as conflictual as compared to the reports of 
behavior identified as neutral. For the thera- 
pist’s reports of his own behavior, it was pre- 
dicted that he would systematically overreport 
those occurrences of behavior in himself which 
had been identified as plus-conflict, and un- 
derreport those which had been identified as 
minus-conflict. 

Concerning the therapist’s perception of the 
behavior of his patients, it was predicted that 
his reports, compared with the actual number 
of occurrences of a given type of behavior, 
would differ significantly from his reports regarding 
neutral material. 

In order to test the hypothesis concerning 
the adequacy of the therapists’ handling of 
conflict vs. neutral material, each therapist 
response was coded by an independent judge 
as being either “task-oriented” or “ego-ori- 
ented.” By definition, task-oriented behavior 
is any behavior on the part of the therapist 
which tends to produce or facilitate in the pa- 
tient a flow of therapeutically relevant con- 
versation, while ego-orientation involves thera- 
pist behavior which departs from this task for 
the apparent purpose of expressing the thera- 
pist’s own needs. Thus, a consistent tendency 
on the part of the therapist to use ego-ori- 
ented mechanisms will result in a lessened 
therapeutic efficiency and a consequent slow- 
ing down of the patient’s progress. 

We may consider ego-oriented behavior as 
a form of defensive activity on the part of the 
therapist. So long as the patient’s productions 
do not impinge upon conflict areas, there is 
little need for the therapist to indulge in such 
defensive activity, and he will adhere to the 
task of psychotherapy. However, when the 
patient presents behavior which does touch 
the therapist’s conflicts, the therapist might 
be expected to resort to ego-oriented behavior 
for the purpose of defending himself against 
the anxiety which the patient’s productions 
arouse. This discussion leads to the following 
prediction: there will be a significant asso- 
ciation between the appearance of ego-ori- 
ented behavior in the therapist, and the ap- 
pearance in the patient’s previous statement 
of material which has conflict-relevance for 
the therapist. 

The coding reliability of the task-ego cate- 
gorization of therapist responses was estab- 
lished by having an independent judge code 
three randomly selected interviews response- 
by-response for either task or ego-orientation. 
This independent judge agreed in 78%, 83%, 
and 85% of the responses with the judge who 
coded the entire set of interviews. These fig- 
ures far exceed chance expectation, and are 
considered to be sufficient evidence for the reli- 
ability of the task-ego categorization scheme. 
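
The statement that these agreement figures exceed chance can be given a rough numerical footing. The sketch below assumes, purely for illustration, that both judges used the task and ego codes in the proportions shown by the marginal totals of Tables 3 and 4; the function name is hypothetical, and since the three reliability interviews are not identified in the report, the result is only an approximate benchmark for chance agreement.

```python
def chance_agreement(task_count, ego_count):
    """Expected proportion of agreement if two judges coded independently,
    each using the task/ego proportions implied by the given counts."""
    total = task_count + ego_count
    p_task = task_count / total
    p_ego = ego_count / total
    return p_task ** 2 + p_ego ** 2

# Marginal totals from Tables 3 and 4 as (task, ego); applying them to
# both judges is an assumption made for this illustration.
marginals = {"Therapist 1": (361, 123), "Therapist 2": (173, 261)}
observed = [0.78, 0.83, 0.85]  # agreement proportions reported in the text

chance = {name: chance_agreement(*counts) for name, counts in marginals.items()}
for name, expected in chance.items():
    print(f"{name}: chance agreement about {expected:.2f}")
print("lowest observed agreement:", min(observed))
```

Under this assumption the chance figures come to roughly .62 and .52, so even the lowest reported agreement of 78% lies above them.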


Results 


The results of the study bearing on the pre- 
diction of the accuracy of the therapists’ re- 
porting of what transpired in the therapy ses- 
sions are presented in Tables 1 and 2. 

An inspection of Tables 1 and 2 indicates 
the general substantiation of the hypotheses 
concerned with the accuracy of the therapist’s 
reporting where conflict and neutral material 
was compared. Of a total of 40 predictions 
made, 28 were completely supported, 2 others 
showed trends in their favor, 4 could not be 
tested because of an insufficient amount of 
material, and 6 were completely without veri- 
fication. In addition, a total of 10 significant 
disparities were found for which we had been 
unable to make predictions.* The failure of 
the therapist evaluation device to predict in 
every case where distortions would occur is 
admittedly a weakness of the instrument. 
However, the improvement in prediction 
which it does permit is extremely encouraging. 

The results of the second major portion of 
the study are presented in Tables 3 and 4. 

The specific prediction that therapist re- 
sponses following patient responses contain- 
ing interpersonal interactions, which had been 
previously judged to be conflictual for the 
therapist, would tend to be ego- rather than 
task-oriented, finds strong support in the data. 

*In every case, significance was tested by means 
of the chi-square test of association or Fisher’s exact 
test. The number of occurrences and reports of those 
traits which had been identified as neutral was used 
as a standard against which to compare the fre- 
quency of occurrence and reported occurrence of 
each of the other traits. 
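
The chi-square test of association named in the footnote can be rerun on the frequencies printed in Tables 3 and 4. The following is a minimal sketch, not the authors' computation: the function names are hypothetical, and because the report does not say whether a continuity correction was applied, both the plain and the Yates-corrected statistics are computed. Small differences from the printed values are therefore to be expected.

```python
import math

def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square for the 2 x 2 table [[a, b], [c, d]] with 1 df.
    With yates=True the continuity correction is applied."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(diff - n / 2, 0.0)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * diff ** 2 / denom

def p_value_1df(chi2):
    """Upper-tail probability of a chi-square variate with 1 df."""
    return math.erfc(math.sqrt(chi2 / 2))

# Cell order: task/conflict, task/nonconflict, ego/conflict, ego/nonconflict.
tables = {
    "Therapist 1": (159, 202, 71, 52),
    "Therapist 2": (58, 115, 185, 76),
}

results = {}
for name, cells in tables.items():
    plain = chi_square_2x2(*cells)
    corrected = chi_square_2x2(*cells, yates=True)
    results[name] = (plain, corrected, p_value_1df(corrected))
    print(f"{name}: chi2 = {plain:.2f}, corrected = {corrected:.2f}, "
          f"p (corrected) = {results[name][2]:.4g}")
```

Either form of the statistic leaves the association for Therapist 1 beyond the .02 level and that for Therapist 2 beyond the .001 level quoted in the table notes, although the recomputed values need not equal the printed 6.29 and 59.5 exactly.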


Discussion 


At the very least, we may conclude that the 
general formulations concerning the manifes- 
tations and effects of countertransference are 
substantiated. In addition, it is clear that the 
therapist’s reports of his own and his patients’ 
behavior in therapy are not to be trusted as 
accurate. In 25 out of 27 cases of overreport- 
ing, there was distortion not only on the basis 
of the comparisons with the neutral traits, but 
also on an absolute basis. That is, the thera- 
pists reported more of the behavior than actu- 
ally occurred. Varying levels of experience 
in therapy and of increased self-awareness 
brought about by personal analysis do not 
seem to make a difference. We may also con- 
clude, with somewhat less confidence, that the 
therapist evaluation device provides us with a 
means of predicting those areas in which dis- 
tortions will occur, and in the case of the 
therapist’s self-reports, the direction that 
these distortions will take. 

Table 3 

Therapist 1 

Category    Conflict   Nonconflict   Total 
Task          159         202         361 
Ego            71          52         123 
Total         230         254         484 

Note.—Chi square 6.29; p < .02; df 1. 


Table 4 

Therapist 2 

Category    Conflict   Nonconflict   Total 
Task           58         115         173 
Ego           185          76         261 
Total         243         191         434 

Note.—Chi square 59.5; p < .001; df 1. 


It is unfortunate that these personality 
variables are not more readily translatable 
into terms which can be applied to psycho- 
analytic theory. It is possible that there is a 
relationship between certain tendencies to 
over- or underemphasize a behavior trait in 
one’s self or in others, and the classical de- 
fense mechanisms described in psychoanalytic 
literature. The tendency of Therapist 1 to 
project his own problem areas into his pa- 
tients’ behavior seems quite clear. Tendencies 
to underestimate the frequency of appearance 
of certain traits in one’s self may parallel the 
mechanisms of denial and repression, and to 
overestimate them, the mechanisms of intel- 
lectualization or isolation. It must be admitted 
that the data of the present study are insuffi- 
cient to allow a definitive statement about this 
issue. The problem of relating overt behavior 
types to the theory of defense mechanisms is 
one which requires further investigation. The 
methodology of this study seems to offer a 
promising means for such an investigation. 
It seems clear that the appearance in the 
patient of behavior which is conflict relevant 
for the therapist prevents the therapist from 
functioning at maximum efficiency. In addi- 
tion, there seems to be a definite relationship 
between the amount of experience and/or 
self-insight which the therapist has, and his 
tendency to show task-oriented, rather than 
ego-oriented behavior. It appears that even 
though both therapists’ perception of their 
patients is disturbed by their own conflicts, 
Therapist 1 is able to make use of his ex- 
perience to behave in a more appropriate and 
effective manner in the actual process of psy- 
chotherapy. Apparently, training and super- 
vision do pay dividends in terms of increas- 
ing the benefits of psychotherapy to the pa- 
tient, if the proportion of task vs. ego-oriented 
statements by the therapist is accepted as a 
rough criterion of therapeutic effectiveness. 
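
The comparison between the two therapists can be read directly from the frequencies in Tables 3 and 4. The short sketch below, with hypothetical helper names, computes the overall proportion of task-oriented responses for each therapist and the rate of ego-oriented responses following conflict and nonconflict material. It is offered only as an arithmetic illustration of the rough criterion described above.

```python
# Response counts from Tables 3 and 4: (task, ego) within each condition.
counts = {
    "Therapist 1": {"conflict": (159, 71), "nonconflict": (202, 52)},
    "Therapist 2": {"conflict": (58, 185), "nonconflict": (115, 76)},
}

def task_proportion(by_condition):
    """Share of all coded responses that were task-oriented."""
    task = sum(t for t, _ in by_condition.values())
    total = sum(t + e for t, e in by_condition.values())
    return task / total

def ego_rate(task, ego):
    """Share of responses in one condition that were ego-oriented."""
    return ego / (task + ego)

summary = {}
for name, by_condition in counts.items():
    summary[name] = {
        "task_overall": task_proportion(by_condition),
        "ego_conflict": ego_rate(*by_condition["conflict"]),
        "ego_nonconflict": ego_rate(*by_condition["nonconflict"]),
    }
    s = summary[name]
    print(f"{name}: task overall {s['task_overall']:.1%}, "
          f"ego after conflict {s['ego_conflict']:.1%}, "
          f"ego after nonconflict {s['ego_nonconflict']:.1%}")
```

Read this way, Therapist 1 gives task-oriented responses about three times in four, against roughly two in five for Therapist 2, and for both therapists the ego rate is higher after conflict material than after nonconflict material.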
In addition to the relevance which this 
study has for the selection and assignment 
of patients to therapists whose personalities 
would be least limiting in the relationship 
with a given patient, and the importance 
which it has in regard to the confirmation of 
analytic formulations concerning the mani- 
festations, sources, and effects of counter- 
transference, the possibility of further investi- 
gation of psychoanalytic theory through a re- 
formulation in terms of concepts from general 
psychology should not be overlooked. Such at- 
tempts to reconcile evidence from two fairly 
separate spheres of psychology may permit 
us to move more rapidly toward a truly gen- 
eral theory of behavior. 


Summary 


This study was concerned with the effects 
of countertransference reactions in the thera- 
pist upon his perception of his own and his 
patients’ behavior in psychotherapy, and upon 
his effectiveness in dealing with patient ma- 
terial which impinged upon his own areas 
of conflict. Based upon a reformulation of 
the psychoanalytic concept of transference in 
terms of Bruner’s theory of perception, two 
hypotheses were developed for test. The first 
postulated a systematic relationship between 
the therapist’s conflicts, and his tendency to 
over- or underreport the occurrence of similar 
behavior in himself and his patient in psycho- 
therapy. The second hypothesized that the 
therapist’s therapeutic handling of material 
which was conflict-relevant for him would be 
less adequate than his handling of material 
which was relatively conflict-free. 

Therapist conflict areas were identified by 
means of a specially developed rating scale 
based upon adjectives (traits) derived from 
the “Circle” interpersonal mechanism coding 
scheme developed by Freedman et al. The ap- 
pearance of significant disparities between the 
therapist’s rating of himself and judges’ rat- 
ings of him was assumed to indicate the pres- 
ence of conflict. Two therapists, whose identi- 
fied conflict areas were dissimilar, tape re- 
corded a series of interviews and, immediately 
following each therapy session, dictated a de- 
tailed account of what they perceived to have 
transpired during the interview. These inter- 
views and reports were then coded by means 
of the Circle. The interview codings were as- 
sumed to represent an objective account of 
what had transpired, and were used as a 
standard against which to compare the thera- 
pist’s reports. In addition, each therapist re- 
sponse was coded as being task-oriented or 
ego-oriented, and the latter was assumed to 
represent a less adequate handling of the pa- 
tient than the former. 

Of a total of 40 predictions made regard- 
ing the therapist’s tendency to distort his 
reports of need-relevant behavior, 28 were 
clearly confirmed, 2 found partial support, 4 
could not be tested because of an insufficient 
number of occurrences of the behavior in 
question, and 6 were not supported. In addi- 
tion, there was a significant association be- 
tween the appearance, in the patient’s state- 
ment, of material which impinged upon the 
conflict areas of the therapist and the judged 
inadequacy of the therapist’s immediately 
following response. 

The results were interpreted to offer strong 
support, both for the validity of the Freudian 
formulations concerning the effect of counter- 
transference and for the adequacy of the 
translation of these formulations into Bruner’s 
terms. 

The feasibility of the objective study of the 
therapeutic relationship was pointed out, and 
the potential fruitfulness, for future research, 
of additional translations of Freudian con- 
cepts into the terms of general psychology 
was indicated. 


Received August 29, 1957. 


Social Desirability and Self-Ratings of Intakes, 
Patients in Treatment, and Controls¹ 


H. J. Wahler 


Veterans Administration Hospital, Knoxville, Iowa 


A number of recent studies (1, 2, 3, 7, 11) 
have consistently demonstrated a strong rela- 
tion between self-descriptions and rated so- 
cial desirability of personality items. Com- 
parable correlations have been obtained with 
different item content, scaled and nonscaled 
social desirability weights, and various tech- 
niques such as true-false questionnaires, self- 
ratings, and Q sorts. The stability of certain 
social desirability stereotypes and their rela- 
tive independence of such variables as age, 
education, sex, socioeconomic status, and cul- 
tural differences have been shown by Klett 
(8, 9) and others (1). 

All of the findings cited, involving self de- 
scriptions, were based on the responses of stu- 
dent samples. These studies show that a high 
degree of correspondence between averaged 
self descriptions and item desirability is a 
general response characteristic of such groups. 
Rosen (11) obtained results identical to those 
of Edwards (1) with his student groups. How- 
ever, he found that when the self appraisals 
of individuals were correlated with social de- 
sirability, the obtained coefficients ranged 
from negative to positive values, with a pre- 
ponderance on the positive side. These results 
indicate that not all people give self descrip- 
tions which agree positively with item desir- 
ability. 

1 The author wishes to extend his appreciation to 
S. J. Williamson and Harold Bechtoldt for their kind 
assistance in obtaining student ratings and to Lynn 
Roberts, Margaret Walsh and Marvin Graber for 
their generous help in rating patients and scheduling 
patients for testing. The author is also very grateful 
to Rowena Rash for her painstaking assistance with 
the major portion of the computations necessary for 
this study. 

To account for the relationship which he 
found, Edwards suggested two hypotheses. 
One was that people either consciously or un- 
consciously tend to describe their personality 
traits in such a way as to “look good.” They 
may accomplish this by indicating that so- 
cially desirable items are characteristic of 
them and that undesirable items are not, re- 
gardless of the validity of such assertions. 
The second assumption was that personality 
traits judged desirable are in actuality more 
typical of members of a culture than are un- 
desirable traits, and this is reflected in the av- 
eraged self-descriptions of nondeviant groups. 

Considering only the correspondence of sub- 
jects’ (Ss’) traits (which could be defined op- 
erationally in terms of specified personality 
assessment procedures) and self-descriptions 
with social desirability, Edwards’ hypotheses 
imply two general types of respondents: (a) 
Ss whose traits (actually a selected sample) 
are not in agreement with social values but 
whose self-appraisals correspond with item 
desirability and (b) Ss whose traits corre- 
spond with social desirability and whose self- 
descriptions reflect this. Rosen’s findings pro- 
vide a basis for assuming two additional types 
of respondents: (c) Ss whose personality 
traits do not correspond with social values 
and whose self-descriptions reflect this, and 
the logical possibility of (d) Ss whose person- 
ality traits correspond with social values but 
whose self-descriptions are not positively re- 
lated to item desirability² (malingerers?). 

2 While these four suggested classifications are ex- 
pressed in categorical terms to facilitate exposition, 
it is more realistic to assume that all of the factors 
are a matter of degree, i.e., personality traits of indi- 
viduals are more or less in agreement with social 
values as are the self-appraisals of groups or indi- 
viduals. 

Since self-descriptive techniques are widely 
used in clinical practice, it is important to in- 
vestigate the degree of correspondence be- 
tween the self-appraisals of clinical Ss and 
item desirability. Edwards and others have 
shown that if the social desirability weights 
of a set of personality items are known, ap- 
proximately 75% of the variance in self-ap- 
praisals of nonclinical groups can be pre- 
dicted from such weights. If it were found 
that the self-appraisals of clinical groups cor- 
responded as highly with item desirability 
and could not discriminate clinical from non- 
clinical Ss on any other basis, the clinical use- 
fulness of such techniques would be subject 
to even more serious conjecture than is cur- 
rently the case (4, 5, 6). 

With clinical Ss, particularly those who are 
receiving psychotherapy, operationally defined 
procedures may be readily used to obtain ap- 
praisals of certain of their personality traits 
by therapists. The correspondence between 
therapists’ appraisals of patients and social 
desirability and the correspondence between 
clinical Ss’ self-descriptions and item desir- 
ability may provide an objective basis for 
classifying clinical Ss in terms of respondent 
characteristics such as those suggested above. 

The clinical literature contains numerous 
descriptive accounts of various traits judged 
to be characteristic of patients. Many of these 
traits, when sufficiently pronounced, are con- 
sidered socially undesirable, i.e., hostility, de- 
pendency, immaturity, suspiciousness, self- 
preoccupation, poor interpersonal relation- 
ships, etc. If such traits were judged to be 
typical of clinical Ss, it might be anticipated 
that therapists’ appraisals would not correlate 
significantly with social desirability. If, how- 
ever, clinical Ss tend to deny undesirable 
traits which their therapists attribute to them 
and give self-descriptions corresponding pri- 
marily to social desirability stereotypes, this 
could readily be interpreted as defensiveness 
and/or lack of insight. Under these circum- 
stances, clinical groups could be classified as 
Ss whose traits are not in agreement with so- 
cial values but who give self-appraisals corre- 
sponding with item desirability. 

An investigation of these factors in con- 
junction with clinical Ss cannot ignore the 
possibility that psychotherapy may affect the 
self-descriptive characteristics of such people. 
It is widely held by many clinicians that, 
among other effects, psychotherapy tends to 
reduce defensiveness and/or increase insight 
or realistic self-awareness. If therapy has this 
effect, the self-descriptions of people receiving 
therapy (patients) should be less strongly re- 
lated to item desirability (and more strongly 
related to therapists’ estimates of their traits) 
than would be the case with people seeking 
such assistance for the first time (intakes). 
This, of course, would not hold if the self- 
descriptions of intakes were in no way indica- 
tive of defensiveness. Considering the repeated 
emphasis in the literature on the observed de- 
fensiveness of people undertaking therapy, 
particularly at the outset, it seems probable 
that their self-descriptions may also reflect 
this tendency. 

While this investigation is primarily con- 
cerned with the relationships between social 
desirability and self-descriptions of clinical 
Ss and therapists’ appraisals, there are other 
characteristics of the self-descriptions of clini- 
cal Ss that may be studied with the obtained 
material. One is the level at which they as- 
cribe various traits to themselves relative to 
nonclinical groups. Such an analysis is impor- 
tant since it is possible to obtain the same 
degree of correlation between self-descriptions 
and item desirability for two different groups 
and still find that one group scored at a sig- 
nificantly higher level than the other. An- 
other property concerns the correspondence 
between self-descriptions obtained from vari- 
ous groups and what may be termed norma- 
tive estimates of the traits of a criterion 
group. In this case, a check on the “validity” 
of the self-descriptions of clinical groups can 
be made by determining the extent to which 
they correspond with therapists’ appraisals of 
patients. If intakes and patients are describ- 
ing themselves with any validity, their self- 
appraisals would be expected to correlate sig- 
nificantly with therapists’ appraisals. On the 
other hand, the self-descriptions of nonclini- 
cal Ss would not be expected to correspond as 
highly with the judged traits of clinical Ss. 

One purpose of this study is to compare the 
degree of correspondence between self-ratings 
and independently obtained social desirability 
weights for a set of personality items found 
with intakes, patients, and student and non- 
student control groups. A second purpose is 
to compare the coefficients obtained by corre- 
lating averaged therapists’ ratings of a group 
of patients with the self-ratings obtained from 
the above groups. The above groups will also 
be compared with regard to the relative levels 
at which they ascribe clinically meaningful 
traits to themselves which are classified as un- 
desirable, slightly undesirable, and desirable. 

Klett’s (9) findings with groups of hos- 
pitalized psychotics suggest that there may 
be some systematic differences in judged so- 
cial desirability between clinical and nonclini- 
cal groups. Since differences in conceptions of 
social values could affect self-descriptions, the 
agreement among social desirability ratings 
obtained from students and intake and pa- 
tient samples from an outpatient clinic will 
be determined. 


Procedure 
Subjects 


Two different groups of male clinical Ss 
were used. The patient group consisted of 42 
veterans receiving psychotherapy at an out- 
patient mental hygiene clinic. Only patients 
who were not judged to be grossly disturbed 
or psychotic were asked to participate. These 
people had received an average of 84.83 one- 
hour individual interviews; the number of in- 
terviews ranged from 12 to 392. Patients gave 
social desirability and self-ratings in counter- 
balanced order. The intake group was com- 
posed of 25 veterans in the process of apply- 
ing for treatment at the same clinic. All of 
these Ss gave self-ratings; 6 were randomly 
selected to give social desirability ratings. 
Four different student groups gave social 
desirability ratings. They were comprised of 
62 male and 29 female students in intro- 
ductory psychology classes at the State Uni- 
versity of Iowa (SUI) and 6 male and 27 
female students attending a night course in 
abnormal psychology at Drake University. 
Self-ratings were obtained from a group of 27 
SUI male students and 30 male veterans re- 
ceiving outpatient medical treatment for non- 
psychiatric conditions at a Veterans Adminis- 
tration outpatient center. The veteran group 
was carefully screened to exclude Ss with rec- 
ords of neurological or psychiatric diagnoses 
or complaints. None of the Ss were older than 
55 or younger than 18. 


Self-rating Inventory 


The inventory used to obtain self-ratings 
and social desirability ratings consisted of 44 
items pertaining to characteristics commonly 
regarded as important clinical variables such 
as anxiety, hostility, sexual difficulties, de- 
pendency, suspiciousness, poor interpersonal 
relations, etc. All self-ratings were obtained 
with a 9-point scale. The extremes were an- 
chored “not at all like me” and “beyond ques- 
tion very much like me.” Social desirability 
ratings were also obtained with a 9-point 
scale. Extremes were anchored “definitely very 
highly socially acceptable” and “definitely 
very highly socially unacceptable.” All Ss 
were oriented to estimate how desirable the 
various traits would be in general to other 
people: Rosen’s (11) “perceived social desir- 
ability.” 


Therapists’ Ratings of Patients in Treatment 


The 42 patients in treatment were being 
seen by four therapists during the time data 
was collected; 14, 11, 9, and 8 patients, re- 
spectively, were in treatment with different 
therapists. Therapists rated all patients who 
were serving as Ss on the items of the self- 
rating inventory. The same nine-point scale 
was used except that the pronoun “him” was 
substituted for “me” in all anchoring phrases. 
Therapists also rated the items for social de- 
sirability. 


Results 


Social desirability ratings of the student 
groups were pooled since all intercorrelations 
were high (r = .922) and there were no sig- 
nificant group differences. 

The correlational matrix was computed for 
the averaged social desirability ratings of the 
combined student group (N = 124), intakes, 
patients, and therapists; the average r was 
.920. A test of over-all homogeneity was non- 
significant;³ none of the group means differed 
significantly. These analyses indicate that the 
social desirability ratings obtained from stu- 
dents, intakes, patients, and therapists are 
highly comparable. 

3 The results of any statistical test will be termed 
significant if the corresponding p value is .05 or less 
and nonsignificant if p is greater than .05. 

The averaged self-ratings obtained from 
outpatient veterans, SUI students, treatment 
and intake patients, and therapists’ ratings 
of patients were correlated with mean social 
desirability ratings given by four different 
groups. These coefficients are presented in 
Table 1. None of the correlations in any given 
column differ significantly. Therefore, only r’s 
in the first row of Table 1 are compared. 


Table 1 

Correlations of Self-Ratings with Social Desirability Ratings for Controls, Intakes, 
Patients, and Therapists’ Ratings of Patients 

                                           Self-ratings 

Social desirability   Outpatient   SUI                                Therapists’ 
ratings               veterans     Males      Intakes     Patients    ratings of patients 
                      (N = 30)     (N = 27)   (N = 25)    (N = 42)    (N = 42) 

Students (N = 124)                            .661        .373         .008 
Patients (N = 42)                             .620        .337        -.016 
Intakes (N = 6)                               .526        .242        -.093 
Therapists (N = 4)                            .521        .211        -.103 

* Coefficients based on averaged ratings of 44 items; r = .385 
significant at .01 level; r = .298 significant at .05 level. 


Table 2 

Intercorrelations Among Mean Self-Ratings of Controls, 
Intakes, Patients, and Therapists’ Ratings of Patients 

                       SUI                              Therapists’ 
                       Males     Intakes    Patients    ratings of patients 

Outpatient veterans    .817      .710       .425 
SUI Males                        .687       .514 
Intakes                                     .811 
Patients 

* Coefficients based on averaged ratings of 44 items; r = .385 
significant at .01 level; r = .298 significant at .05 level. 


Correlations between judged social desir- 
ability and self-ratings obtained from the stu- 
dent and nonstudent groups are nearly identi- 
cal. The self-ratings of these groups have 
about 57% common variance with item de- 
sirability. This value may be contrasted with 
44% and 14% common variance with item so- 
cial desirability in the self-ratings of intakes 
and patients, respectively. Therapists’ ratings 
of patients were not significantly related to 
social desirability. 
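
The common-variance percentages follow from squaring the correlations, and the significance levels can be checked with the usual t test for a correlation coefficient. The sketch below takes the first-row coefficients for intakes, patients, and therapists’ ratings as read from Table 1 (.661, .373, and .008) and the 44 items as the number of paired observations. The function names are hypothetical, and the critical t values for 42 degrees of freedom come from a standard t table, not from the article.

```python
import math

N_ITEMS = 44  # each coefficient is computed over the 44 inventory items

def common_variance(r):
    """Proportion of variance shared by two sets of ratings."""
    return r ** 2

def t_for_r(r, n=N_ITEMS):
    """t statistic for testing r against zero, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# First-row coefficients of Table 1 (student desirability ratings).
first_row = {"Intakes": 0.661, "Patients": 0.373, "Therapists' ratings": 0.008}

# Two-tailed critical t values for 42 df, taken from a standard t table
# (an outside assumption, not a figure given in the article).
T_CRIT_05, T_CRIT_01 = 2.018, 2.698

report = {}
for group, r in first_row.items():
    t = t_for_r(r)
    level = ".01" if abs(t) >= T_CRIT_01 else ".05" if abs(t) >= T_CRIT_05 else "n.s."
    report[group] = (common_variance(r), t, level)
    print(f"{group}: r = {r:.3f}, common variance = {r * r:.0%}, "
          f"t = {t:.2f}, significance = {level}")

# A common variance of about 57% corresponds to r of about .75.
r_controls = math.sqrt(0.57)
```

Squaring .661 and .373 gives the 44% and 14% quoted in the text, and the threshold correlations of .298 and .385 given in the table note correspond to t values just above the tabled critical points.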

While the correlation between self-descrip- 
tions and item desirability was slightly lower 
for intakes than for controls, the difference is 
short of significance. The correlation between 
self-descriptions and item desirability for pa- 
tients is considerably less than for intakes 
and the two control groups. 

Table 2 contains the matrix of intercorrela- 
tions among the averaged self-ratings of out- 
patient veterans, SUI students, patients, in- 
takes, and therapists’ ratings of patients. The 
test of homogeneity of r’s was highly signifi- 
cant. Self-ratings of patients and intakes cor- 
relate significantly with therapists’ ratings of 
patients; the self-ratings of both control 
groups are not significantly related to these 
values. Self-ratings of patients correlate sig- 
nificantly higher with therapists’ ratings than 
do those of intake⁴ or control groups. The 
self-ratings of intakes correlate significantly 
higher with therapists’ ratings than do those 
of outpatient veterans but not SUI students. 

4 Therapists rated the 42 patients in treatment. It 
is assumed that idiosyncratic differences among indi- 
vidual patients and therapists were averaged out and 
that these pooled values are generally representative 
of traits manifested by patients seen at this clinic. 
However, it may be questioned whether intakes’ self- 
ratings might not have corresponded more highly 
with therapists’ ratings if they had been the indi- 
viduals rated. To investigate this possibility, the av- 
eraged self-ratings obtained from 25 different intakes 
on 36 of the 44 items were correlated with ratings 
done by social workers who saw them for purposes 
of obtaining a case history. Social workers’ ratings 
of intakes correlated .46 with their mean self-ratings. 
Although not an entirely satisfactory comparison 
since therapists had more opportunity for observa- 
tion, this adds support to the finding that intakes’ 
self-ratings do not correspond as highly with ob- 
servers’ ratings as do those of patients. 

It may also be noted that self-ratings of the 
intake and patient groups correlate .811 with 
each other. Self-ratings of the two control 
groups correlate at about the same level (r 
= .817). The self-ratings of intakes or pa- 
tients correspond to a lesser degree with the 
self-ratings of either control group; these r’s 
range from .425 to .710. 

The average levels at which the various 
groups rated themselves were compared. To 
make these comparisons, the 44 items were 
categorized as undesirable, slightly undesir- 
able, and desirable.⁵ The mean level of the 
ratings on items in these three categories was 
obtained for each S and for therapists’ ratings 
of patients. Analysis of variance (10) based 
on these scores yielded highly significant ef- 
fects for groups,⁶ social desirability, and inter- 
action. The mean ratings of the four groups on 
items in the three social desirability categories 
are presented in Table 3. 

Tests of the significance of differences be- 
tween mean self-ratings of the different groups 
within each category were made. The results 
show that patients rate items classified as un- 
desirable and slightly undesirable at a sig- 
nificantly higher level than intakes. It was also 
found that both clinical groups tend to rate 
undesirable and slightly undesirable items at 
significantly higher levels than controls. In- 
takes and patients also rate desirable items at 
a higher level than controls; however, this dif- 
ference is not significant. The best discrimina- 
tion between controls and intakes or patients 
is obtained with ratings in the slightly unde- 
sirable category. Ratings in this category cor- 
rectly identified 78% of intake and control Ss 
and 85% of patients and controls. 

5 These classifications were based on the pooled so- 
cial desirability ratings of students (N = 124). Arbi- 
trary cutting points of mean values less than four, 
four to five, and greater than five were used to de- 
fine the categories undesirable, slightly undesirable, 
and desirable. The number of items in each classifi- 
cation is 16, 20, and 8 respectively. 

6 The same analysis was done comparing outpatient 
veteran controls with SUI students. Since this analy- 
sis showed that there were no significant differences 
between these two groups, only scores from one 
group were included in this analysis. The outpatient 
veteran group was selected since it is more com- 
parable to the veteran clinic groups in terms of age, 
education, socioeconomic level, and military experi- 
ence than the students. 


Table 3 

Mean Self-Ratings on Items Classified as Undesirable, 
Slightly Undesirable, and Desirable for Controls, 
Intakes, Patients, and Therapists’ Ratings of Patients 

Columns: Type of item; Therapists’ ratings of patients; Patients; Intakes; Outpatient veterans 
Rows: Undesirable (16 Items); Slightly undesirable (20 Items); Desirable (8 Items) 
Legible entries: 4.90 3.70 2.91; 6.05; 4.73 
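
The classification rule in footnote 5 and the per-subject category means used in the analysis can be sketched as follows. The function names and the four example items are hypothetical and are not data from the study; the cutting points are those stated in the footnote, with a mean of exactly four or five placed in the middle category.

```python
def classify_item(mean_desirability):
    """Assign an item to a category from its mean social desirability
    rating, using the cutting points given in footnote 5."""
    if mean_desirability < 4:
        return "undesirable"
    if mean_desirability <= 5:
        return "slightly undesirable"
    return "desirable"

def category_means(self_ratings, item_means):
    """Mean self-rating of one subject within each category.
    Both arguments map item identifiers to ratings."""
    grouped = {}
    for item, rating in self_ratings.items():
        grouped.setdefault(classify_item(item_means[item]), []).append(rating)
    return {cat: sum(vals) / len(vals) for cat, vals in grouped.items()}

# Hypothetical values for four items; not data from the study.
item_means = {"a": 2.5, "b": 4.0, "c": 5.0, "d": 6.8}
self_ratings = {"a": 3, "b": 6, "c": 4, "d": 7}

scores = category_means(self_ratings, item_means)
print(scores)
```

Each subject thus contributes one mean per category, and it is these three scores per subject that enter the groups by social desirability analysis.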


Discussion and Conclusions 


The findings show that the self-ratings of 
intakes correlate with item desirability to 
about the same degree as the self-ratings of 
student and nonstudent controls. The self- 
descriptions of patients show only slight cor- 
respondence with cultural desirability stereo- 
types. 

In the case of intakes, however, and to a 
much greater extent with patients, there is 
evidence that their self-descriptions are to 
some degree “valid.” It was shown that in- 
takes do attribute clinically meaningful char- 
acteristics to themselves to a greater extent 
than controls. Their self-descriptions corre- 
lated significantly with therapists’ ratings 
while the ratings of nonclinical Ss did not; 
their self-ratings correlated significantly more 
highly with those of patients in treatment 
than did ratings of controls; and intakes rated 
themselves at significantly higher levels than 
controls on items classified as undesirable and 
slightly undesirable. To this extent, their self- 
descriptions appear to contain “valid” vari- 
ance. With patients, the evidence is much 
more strongly in this direction. 

These findings, plus the fact that therapists’ 
ratings of patients’ traits did not correlate 
with social desirability, indicate that patients 
may be readily classified as Ss whose person- 
ality traits do not correspond with social 
values and whose self-descriptions reflect this. 
It is apparent that the self-descriptions of pa- 
tients cannot be as readily predicted from 
item desirability weights as appears to be the 
case with nonclinical Ss. If certain traits of in- 
takes, as a group, are comparable to those of 
patients, and hence do not correspond with 
social values, intakes would be classified as Ss 
whose traits do not agree with social values 
but whose self-descriptions correspond with 
item desirability. Nevertheless, the self-descrip- 
tions of intakes can be reliably discriminated 
from those of nonclinical Ss, since intakes rate 
undesirable and slightly undesirable traits at 
a significantly higher level than controls, and 
the self-ratings of intakes correlate signifi- 
cantly higher with therapists’ ratings of pa- 
tients. These findings lend support to the as- 
sumption that self-descriptive techniques have 
potential clinical and experimental utility. 
Comparing patients with intakes, the find- 
ings show that the self-descriptions of people 
receiving therapy correspond much less with 
item desirability than is the case with people 
seeking help who have not experienced psycho- 
therapy. It was also shown that patients’ de- 
scriptions correspond more closely with the 
appraisals of therapists than do those of in- 
takes. These differences between patients and 
intakes are in agreement with the assumption 
that psychotherapy has the effect of reducing 
defensiveness and increasing insight or real- 
istic self-awareness. The additional finding 
that the number of therapy interviews is sig- 
nificantly related⁷ to the level at which pa- 
tients ascribe undesirable characteristics to 
themselves is also in accord with this view. 
Among both intake cases and patients in 
treatment, there were individuals who avoided 
ascribing undesirable or slightly undesirable 
traits to themselves. As mentioned earlier, a 
marked tendency to give self-descriptions 
which correspond primarily to cultural stereo- 
types of desirability, and to deny clinically 
meaningful traits judged undesirable among 


7 While not high, the linear correlation was .39 
(p < .025); eta was .66, but the deviation from 
linearity was not significant. If Ss who had received 
no therapy had been included, these coefficients would 
have been larger. 
people sufficiently maladjusted to seek or par- 
ticipate in therapeutic assistance, suggests 
considerable defensiveness and/or lack of in- 
sight on their part. At least the probability is 
high that they are failing to provide person- 
ally relevant communication. 

The extent to which such noncommunicative 
tendencies are related to various prognostic 
criteria has not been explicated. Questions 
may be raised, such as, do people seeking help 
with personal problems who are noncom- 
municative at the outset constitute poor thera- 
peutic risks? Are they more likely to terminate 
therapy prematurely or fail to show signs of 
profiting from continued therapy than persons 
who are initially more communicative? The 
study of Rubinstein and Lorr (12) suggests 
that some factor such as noncommunicative- 
ness may be related to premature termination. 
These authors found that “. . . [remainers] 
are more dissatisfied with themselves, feel 
worse, and see themselves as having poorer 
interpersonal and overall adjustment than 
terminators.” 

Since it was shown that patients receiving 
therapy were typically more inclined to ascribe 
undesirable and slightly undesirable traits to 
themselves than were intakes, the question 
may also be raised as to whether prognosis is 
more unfavorable for those patients who fail 
to show an increase in personally relevant 
communication after a certain number of in- 
terviews. 

The evidence obtained in this study, while 
preliminary, suggests that scores based on the 
level at which intakes and patients rate them- 
selves on clinically meaningful items classified 
in terms of social desirability values may 
provide useful indices of their ability or will- 
ingness to communicate personally relevant in- 
formation. These measures may in turn have 
potential utility in evaluating response to 
therapy as well as possible selective and pre- 
dictive properties. 


Summary 


The purpose of this study was to compare 
the relationships between judged social de- 
sirability and self-ratings obtained from pa- 
tients in treatment at an outpatient clinic, 
intakes, and student and nonstudent control 
groups. Another purpose was to correlate the 
self-descriptions of these various groups with 
therapists’ ratings of patients. 

The findings showed that the self-ratings 
of intakes correlated with social desirability 
weights to about the same degree as the self- 
ratings of controls. The correlation between 
the self-ratings and item desirability obtained 
with patients was significantly lower than the 
correlations obtained with intakes or controls. 

Self-descriptions of both clinical groups cor- 
related significantly with therapists’ ratings of 
patients, while the self-ratings of the two con- 
trol groups did not. 

With items classified as undesirable, slightly 
undesirable, and desirable, it was shown that 
patients rated themselves significantly higher 
than intakes, and intakes rated themselves sig- 
nificantly higher than controls on undesirable 
and slightly undesirable items. Items in the 
slightly undesirable classification gave the best 
discrimination between clinical groups and 
controls. 

The possibility of using scores based on 
items classified as undesirable and slightly un- 
desirable as indices of personal communica- 
tiveness was discussed. It was suggested that 
such scores deserve further study as predictive 
and evaluative measures with regard to 
therapy. 


Received September 3, 1957. 
A Technique for Investigating the Relationship 
Between the Behavior Cues of the Examiner 
and the Verbal Behavior of the Patient¹ 


Leonard Krasner 
Veterans Administration Hospital, Palo Alto, California 


In a previous paper, the writer suggested 
that techniques of conditioning verbal behav- 
ior be used in a program of systematic in- 
vestigation into the process of psychotherapy 
and offered a rationale for such a procedure 
(2). It was proposed that such a program 
be initiated by studying the relationship be- 
tween behavioral cues indicating “attention” 
on the part of the examiner (independent 
variable) and changes in specified verbal be- 
havior on the part of the patient (dependent 
variable). The purpose of this paper is to 
present an experimental technique that can 
be utilized in such an approach and its re- 
sults with two patients. 

A technique was devised to create an inter- 
personal situation which would have as many 
of the characteristics of the psychotherapeutic 
situation as possible. Further consideration 
was to control variables of experimenter’s be- 
havior and to quantify the subject’s verbal 
behavior. It was hypothesized that system- 
atic changes in the experimenter’s behavior 
would result in specifiable changes in the sub- 
ject’s verbal behavior. 


Method 


Subjects. Two male, white, veterans, 30 
years of age, with high school education, and 
diagnosed as schizophrenic reactions in re- 
mission, were asked to participate in a study 
to determine how people “tell stories.” 


1 From the Veterans Administration Hospital, Palo 
Alto, California. Research was conducted at the Vet- 
erans Administration Hospital, Lexington, Kentucky. 

2 A paper based on this study was presented at the 
1957 American Psychological Association meetings. 


Procedure. The equipment consisted of a 
standard timer, placed on the desk so that 
both S and E could see it, and a tape re- 
corder. Prior to the first session, S was given 
the following instructions: “I want you to 
make up a story with at least four characters 
in it, a mother, a father, a child, and an ani- 
mal. It can be any kind of a story. I want 
you to tell me the story for ten minutes. I'll 
start the timer now, and it will ring when the 
ten minutes have passed. All right, go ahead.” 
Prior to each of the following 24 sessions, 
the same instructions were repeated with one 
change. Instead of “I'll start the timer now, 
and it will ring when ten minutes have 
passed,” E said, “You can either continue 
your story from the last session or start a 
new one.” 

Each S participated in 25 ten-minute ses- 
sions, which were divided into five blocks of 
five sessions each. During the first block of 
five 10-minute sessions (nonreinforcement), 
E avoided looking at the S. E either looked 
down at the blank pad in front of him or 
looked out of the nearby window. During the 
second block of five sessions (reinforcement), 
E responded with a combination of behavior 
cues after each time the S made reference to 
the class of verbal behavior “mother.” E’s 
combined behavior cue during the reinforce- 
ment condition consisted of looking at S, 
nodding his head, smiling, and emitting an 
“mmm-hmm” sound. The class of verbal be- 
havior “mother” was defined as “all nouns 
and pronouns referring to the mother figure 
in the S’s story.” 

In the third and fifth blocks of five sessions 
the nonreinforcement condition, as in the first 
block of five sessions, was repeated. The 
fourth block of five sessions repeated the re- 
inforcement condition of the second block of 
five sessions. 

After completion of the 25th session, the 
Ss were interviewed to determine their degree 
of awareness of the conditioning procedure. 
This interview consisted of a dozen questions 
such as, “What do you think was the pur- 
pose of these sessions,” “Did you notice any 
changes in the stories as you went on,” “Do 
you think that my sitting here in any way in- 
fluenced your stories?” 


Results 


The total number of “references” made by 
S to each of the major characters in the stories 
—“mother,” “father,” “child,” and “animal” 
—were summarized for each individual ses- 
sion. To be scored as a “reference,” the ver- 
balization had to be a noun or pronoun re- 
ferring to one of the four major characters. 
The ratios of “mother references” to “mother- 
father references” and to total “references” 
were computed for each of the 25 sessions. 

These two ratios were divided into those 
verbalizations emitted under reinforcement 
conditions and those verbalizations emitted 
under nonreinforcement conditions. The 
Mann-Whitney U test (5, pp. 116-126) for 
the significance of differences of “references” 
under the conditions of reinforcement and 
nonreinforcement for these two ratios was 
computed for both subjects. All four of the 
ratios were in the predicted direction, and 
three of the four, both ratios of S₁ and the 
“Mother”/total ratio for S₂, were significant 
beyond the .05 level. 


Table 1 

Number of “Mother,” “Father,” and Total “References” (Nouns and Pronouns), and Percentages of 
“Mother” to “Mother and Father” and to Total “References” for Successive 
Blocks of Five Ten-Minute Sessions 

Column headings: Subject and 10-minute sessions; Condition; 
Number of references (“Mother,” “Father,” Total); 
Per cent references (“Mother”/“Mother and Father,” “Mother”/Total) 

Subject I 
1-5      nonreinf. 
6-10     reinf. 
11-15    nonreinf. 
16-20    reinf. 
21-25    nonreinf. 

Subject II 
1-5      nonreinf. 
6-10     reinf. 
11-15    nonreinf.    18 
16-20    reinf.       129 
21-25    nonreinf.    100 

Legible cell values, row and column positions uncertain: 
201 45 12 
66 21 
52 21 
56 26 
46 16 
384 
51 319 
109 261 
114 519 
153 552 

An examination of these two sets of ratios 
for successive blocks of five sessions indicates 
that the ratios of the reinforced class of be- 
havior increased under reinforcement condi- 
tions and decreased under nonreinforcement, 
increased again during the succeeding rein- 
forcement series, and decreased again during 
the final block of nonreinforcement sessions. 
These systematic variations were also true of 
the absolute number of “references” as well 
as the percentages. Table 1 presents both per- 
centages and numbers of “references” for suc- 
cessive blocks of five 10-minute sessions. 
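
As a rough illustration of the scoring and comparison described in the Results, the sketch below computes the two ratios for each session and a Mann-Whitney U statistic contrasting reinforcement with nonreinforcement sessions. The per-session counts are invented for illustration and are not the study's data; the function names `ratios` and `mann_whitney_u` are hypothetical, and the lookup of U against the significance tables in Siegel (5) is left out.

```python
from typing import List, Tuple


def ratios(mother: int, father: int, total: int) -> Tuple[float, float]:
    """Return (mother / (mother + father), mother / total) for one session."""
    return mother / (mother + father), mother / total


def mann_whitney_u(x: List[float], y: List[float]) -> float:
    """U for sample x: number of (x_i, y_j) pairs with x_i > y_j, ties counted as 1/2."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


# Hypothetical (mother, father, total) reference counts per session.
# These are NOT the study's data.
reinforced = [(30, 20, 100), (28, 22, 95), (35, 25, 110)]
nonreinforced = [(10, 20, 90), (12, 24, 100), (9, 21, 80)]

# "Mother" / "Mother and Father" ratio for each session.
r_reinf = [ratios(*s)[0] for s in reinforced]
r_non = [ratios(*s)[0] for s in nonreinforced]

u_reinf = mann_whitney_u(r_reinf, r_non)
u_non = mann_whitney_u(r_non, r_reinf)
u = min(u_reinf, u_non)  # the smaller U is the one referred to the tables
```

The two U values always sum to the product of the two sample sizes, which gives a quick check on the counting.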


Discussion 


This technique of verbal conditioning en- 
ables the testing of hypotheses concerning 
what happens to various classes of verbal be- 
havior under selected experimental conditions. 
The hypothesis in the present study that the 
reinforced class of verbalization varies sys- 
tematically with the application of E’s be- 
havior cues was confirmed with two patients. 
It is of interest to note what happened to the 
other categories of verbal behavior which ap- 
peared in the stories, i.e., “father,” “child,” 
“animal.” All three of these categories varied 
sharply from session to session, but in no sys- 
tematic manner. 

In a review of “verbal conditioning” studies 
with their implications for psychotherapy (3), 
the writer divided the techniques of con- 
ditioning verbal behavior into four general 
categories of experimental situations: saying 
plural nouns; completing sentences; “story- 
telling” and interview situations; and test- 
like situations. The technique used here falls 
in the “storytelling” category and is similar 
to the techniques reported in studies by Ball 
(1) and Mock (4). 


Summary 


A storytelling technique was used to study 
the relationship between examiner behavior 
cues and patient’s verbal behavior. The re- 
sults indicate that changes in a preselected 
class of verbal behavior varied as a function 
of the systematic application of behavior cues 
by the examiner. It is concluded that this 
experimental procedure permits the system- 
atic isolation and study of important variables 
of the interpersonal process basic to psycho- 
therapy. 


Received October 8, 1957. 


References 


1. Ball, R. S. Reinforcement conditioning of verbal 
behavior by verbal and non-verbal stimuli in 
a situation resembling a clinical interview. 
Unpublished doctoral dissertation, Indiana 
Univer., 1952. 

2. Krasner, L. The use of generalized reinforcers in 
psychotherapy research. Psychol. Rep., 1955, 
1, 19-25. 

3. Krasner, L. Studies of the conditioning of verbal 
behavior. Psychol. Bull., 1958, 55, 148-170. 

4. Mock, J. F. The influence of verbal and behav- 
ioral cues of a listener on the verbal produc- 
tions of the speaker. Unpublished doctoral 
dissertation, Univer. of Kentucky, 1957. 

5. Siegel, S. Nonparametric statistics for the behav- 
ioral sciences. New York: McGraw-Hill, 1956. 





Journal of Consulting Psychology 
Vol. 22, No. 5, 1958 


Q—L Variability, MMPI Responses, and College Males 


William D. Altus 


University of California, Santa Barbara College 


One of the fundamental cleavages in intelli- 
gence as it is commonly tested is that obtain- 
ing between the ability to manipulate verbal 
concepts effectively and the ability to reason 
with speed and efficiency in terms of quanti- 
ties. This broad dichotomy of general aptitude 
is recognized by the College Entrance Ex- 
amination Board in its Scholastic Aptitude 
Test (SAT) which is comprised of two tests, 
the Mathematics Aptitude Test (MAT) and 
the Verbal Aptitude Test (VAT). In like 
manner, the American Council on Education 
Psychological Examination in its decades of 
existence had its separate quantitative and 
linguistic scores. The Army General Classifi- 
cation Test of World War II sampled these 
same two abilities, linguistic and quantita- 
tive; a third ability measured by the AGCT 
was a spatial factor. It would appear, there- 
fore, that both theory and practice in the con- 
struction of measures of general aptitude have 
placed the stamp of approval on the func- 
tional independence of these two large sectors 
of one’s intellectual abilities, i.e., dealing ef- 
fectively and precisely with quantities on the 
one hand and with the less precise, often af- 
fectively toned verbal constructs on the other. 

It would seem plausible to expect that in- 
dividual variability along such culturally sig- 
nificant continua as quantitative and linguistic 
abilities might be reflected in one’s style of 
living and in his various attitudinal frames. 
F. L. Wells (6), for instance, feels that these 
aspects of intelligence are significantly related 
to personality attributes. His basic contention 
appears to be that a deficiency in quantitative 
ability, say, is a function of a basic attitude: 
An individual’s being repelled by the rigidity 
and precision of a numerical quantity, being 
attracted by the adumbrative and connotative 
nuances of the adjective and the verbal tag. 


The verbalist prefers such a statement as 
“That day we traveled a long and dusty dis- 
tance,” to the more prosaic and exact, “We 
spent ten hours going 14.2 miles that day.” 
To the verbalist the latter statement limits 
and confines, leaves nothing to the imagina- 
tion, is obtrusive with its specificity and its 
fractions. Freud would be a good example of 
the verbalist. He disliked and distrusted sta- 
tistics; he even relied on others to work out 
his traveling schedule on the railroad, since 
the thicket of numbers in the ordinary time- 
table was apparently beyond his ability to 
handle. Jones (2) quotes a statement of 
Freud’s concerning the quantitative-linguistic 
dichotomy which is pertinent and interesting, 
“I have very restricted capacities or talents. 
None at all for the natural sciences; nothing 
for mathematics; nothing for anything quanti- 
tative.” 

Wells appears to feel that what he calls 
verbalism is no more an entity than is a neu- 
rosis, but smacks, instead, of a temperamental 
characteristic rather than an intellectual one. 
In terms of general adjustment, he feels, the 
verbalist and the quantitatively oriented per- 
son should be roughly coequal, though the 
verbalist may be somewhat more in accord 
with the norms of cultural prestige, which re- 
ward precision and nicety of diction in both 
speaking and writing. Wells seems to approve 
completely of a statement of Adler’s to the 
effect that the quantitatively oriented indi- 
vidual is more self-sufficient, while the ver- 
balist tends to seek security in social rela- 
tions. “This,” Wells continues (6, p. 72), 
“ties in with the more social tendency ob- 
served generally in the ‘Verbal Facility’ group, 
as well as the special need for ‘belonging’ in 
this particular man.” 

In 1946, Munroe (3) published some inter- 
esting findings relative to certain differ- 
ences she found on the group Rorschach be- 
tween students at Sarah Lawrence who were 
quantitatively oriented (as measured by the 
Q and L fractions of the ACE) in comparison 
with those whose tendencies went in the op- 
posite direction. Her general conclusions re- 
lating to the specific differences found were 
that the high Q women were relatively literal 
in their construction of reality, while the high 
L women used a more subjective approach. 
Pemberton (4), working with male executives, 
was able to show some significant relations 
between L and Q differentials on the ACE 
and certain scores on the Kuder Preference, 
Allport-Vernon Study of Values, Thurstone 
Temperament Schedule, and Guilford-Martin In- 
ventory of Factors GAMIN. For example, the 
men whose L scores were a sigma above their 
Q scores had significantly higher scores on 
the Kuder Literary scale, Thurstone’s Reflec- 
tive factor, and the Allport-Vernon Social scale. 

In a previous publication, Altus (1) found 
some significant relations between Q—L differ- 
entials on the ACE and the answers of 200 col- 
lege women to the Minnesota Multiphasic Per- 
sonality Inventory. In general, he found that 
the Q-higher-than-L college woman was prim, 
conventional, immature, anxious, and some- 
what resentful in comparison with the college 
woman whose abilities tended to go in the op- 
posite direction on the ACE. A scale of 43 
items was derived from the MMPI which on 
cross-validation gave a PPM of .25, which is 
just short of acceptance at the .01 level, since 
the N involved was 100. More recently, Spilka 
and Kimble (5) made use of Altus’ 43 MMPI 
items with a population of 87 women students 
at Washburn University and found an r of .22 
with Q—L differentials on the ACE, thus con- 
firming in a striking fashion the validity of 
the items in question. It appears, therefore, 
from these previously reported studies that 
variation in ability along the quantitative- 
verbalist continuum is reflected in certain per- 
sonality factors, which can be specified in 
crude degree for college women, at least. 
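
The remark that an r of .25 with an N of 100 falls just short of the .01 level can be checked with the usual t transformation of a correlation coefficient. The sketch below is illustrative only: the function name `t_from_r` is hypothetical, the critical value is an assumed two-tailed figure from standard t tables, and the text does not say which test or tail the author had in mind.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """t statistic for testing a Pearson r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Assumed two-tailed .01 critical value of t for 98 degrees of freedom,
# read from standard t tables; this figure is not given in the text.
T_CRIT_01_DF98 = 2.627

t_altus = t_from_r(0.25, 100)             # cross-validation r reported for the 43-item scale
falls_short = t_altus < T_CRIT_01_DF98    # True if the r misses the assumed .01 cutoff

t_spilka = t_from_r(0.22, 87)             # r reported by Spilka and Kimble with 87 women
```

Under these assumptions the t for the cross-validation r comes out a little below the cutoff, which agrees with the description of the result as just short of acceptance at the .01 level.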


The Present Study 


Since college women had been previously 
investigated, it was decided to confine the 
present research to college men, using the 
same psychometric variables, i.e., the MMPI 
and the differentials in Q and L on the ACE. 
Two hypotheses were set up: (a) that the 
Q-higher-than-L college male would be more 
masculine (have a lower Mf score on the 
MMPI) than the L-higher-than-Q college 
male; (b) that the L-higher-than-Q men 
should appear somewhat more sophisticated 
and mature on certain of their answers. The 
reasoning behind the hypothesis of greater 
masculinity of the high Q males is that men, 
generally, tend to do better on Q tests than 
do women; hence, it would appear that men 
who excel other men in this sex-biassed vari- 
able should show more of those attitudes cul- 
turally associated with masculinity. The basis 
for the second hypothesis is derived from local 
findings on college women: the more verbal 
women were more mature and sophisticated 
in the answers which discriminated them from 
the higher Q women on the MMPI. 

The raw ACE Q and L scores for 200 col- 
lege males were converted into standard units 
with means of 50 and sigmas of 10. Two 
groups of 100 with comparable Q and L scores 
were formed. Each of these groups of 100 was 
broken into quartiles of 25, the basis for the 
separation being the direction of the discrep- 
ancy scores between the Q and L variables. 
All the validating and clinical scales of the 
MMPI were then tested for significant mean 
differences. In addition, the individual items 
of the MMPI were analyzed for relations to 
the Q—L dichotomy. Only those items which 
showed consistent differences in the same di- 
rection for both groups of 100 males were re- 
tained for study. This precaution was neces- 
sary because of the probably low reliability 
of individual T-F items in the MMPI; using 
the entire 200 as a single group with four 
quartiles for analysis would cause the inclu- 
sion, by chance alone, of a sizable, if indeter- 
minate, number of items from the 566 items 
which comprise the MMPI. 
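
The scoring steps in this paragraph (conversion to standard units with a mean of 50 and a sigma of 10, then a split into quartiles on the Q minus L discrepancy) can be sketched as follows. This is a minimal illustration under stated assumptions: the raw scores are invented, the names `to_standard` and `quartile_groups` are hypothetical, and the use of the population standard deviation is a guess, since the text does not say how sigma was computed.

```python
import statistics
from typing import List


def to_standard(raw: List[float], mean: float = 50.0, sigma: float = 10.0) -> List[float]:
    """Rescale raw scores to the given mean and sigma (population SD assumed)."""
    m = statistics.mean(raw)
    s = statistics.pstdev(raw)
    return [mean + sigma * (x - m) / s for x in raw]


def quartile_groups(discrepancy: List[float]) -> List[List[int]]:
    """Rank subjects by discrepancy and return four equal groups of indices, lowest first.

    Any remainder after dividing the group size by four is dropped.
    """
    order = sorted(range(len(discrepancy)), key=lambda i: discrepancy[i])
    size = len(order) // 4
    return [order[k * size:(k + 1) * size] for k in range(4)]


# Hypothetical raw Q and L scores for eight students; NOT the study's data.
q_raw = [40, 55, 62, 48, 70, 35, 58, 66]
l_raw = [60, 50, 45, 65, 52, 58, 40, 70]

q_std = to_standard(q_raw)
l_std = to_standard(l_raw)
diff = [q - l for q, l in zip(q_std, l_std)]  # positive means Q higher than L
groups = quartile_groups(diff)                # groups[0] is most L-higher-than-Q
```

With 100 men per group, as in the study, each quartile would hold 25 indices; the MMPI scale and item comparisons would then be made between the lowest and highest quartiles.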


Results 


Two of the MMPI scales showed significant 
mean differences between upper and lower 
quartiles when the entire 200 college men were 
considered as a single group and ranked on 
the continuum of Q—L discrepancy on the 
ACE. For both of the separate groups of 100 
men, the quartile containing the men with 
the highest Q-higher-than-L discrepancy score 
showed higher mean values on the MMPI Lie 
scale than did the quartile for the men at 
the opposite pole, where the L-higher-than-Q 
males were found. The mean difference was 
significant at the .01 level. This finding would 
tend to bear out in a gross way the hypothesis 
that the L-higher-than-Q males tend to be 
somewhat more sophisticated or less naive 
than the Q-higher-than-L college male. 

Also in line with the original hypotheses, 
the Mf scale of the MMPI showed a highly 
significant mean difference between the high 
and low quartiles of the total group of 200 in 
Q—L discrepancy scores. The mean difference 
is significant at the .001 level of confidence 
(t of 4.30). The college male whose abilities 
are relatively higher in dealing with quantities 
does have more masculine attitudes, insofar as 
these are validly measured by the Mf scale of 
the MMPI. None of the other MMPI scales, 
either clinical or validating, showed any mean 
difference that was significant even at the .05 
level, corroborating once more Wells’ original 
contention and Altus’ and Munroe’s findings 
that the Q—L discrepancies on the ACE are 
not related to general adjustment for college 
women. 


Table 1 

The Significance of the Differences Between the Answers of 100 High Q and 100 High L 
College Males to Selected Items of the MMPI 

Columns: MMPI No.; Level of confidence; Marking characteristic of Q males; MMPI item. 
[?] marks an entry that is not legible. 

225    [?]    False    I gossip a little at times. 
244    [?]    [?]      My way of doing things is apt to be misunderstood by others. 
295    [?]    [?]      I liked “Alice in Wonderland” by Lewis Carroll. 
384    [?]    [?]      I feel unable to tell anyone all about myself. 
504    [?]    [?]      I do not try to cover up my poor opinions or pity of a person so that he won’t know how I feel. 
[?]    [?]    [?]      I am apt to pass up something I want to do when others feel that it isn’t worth doing. 
[?]    [?]    [?]      I like dramatics. 
[?]    [?]    [?]      I know who is responsible for most of my troubles. 
[?]    [?]    [?]      I like to cook. 
[?]    [?]    [?]      I would like to be a journalist. 
[?]    [?]    [?]      My relatives are nearly all in sympathy with me. 
[?]    [?]    [?]      Even when I am with people I feel lonely much of the time. 
[?]    [?]    [?]      I am apt to pass up something that I want to do because others feel that I am not going about it in the right way. 
[?]    [?]    [?]      I am bothered by people outside, on street-cars, in stores, etc. watching me. 
[?]    [?]    [?]      I do not always tell the truth. 
[?]    [?]    [?]      I am very strongly attracted by members of my own sex. 
78     .05    False    I like poetry. 
82     .05    True     I am easily downed in an argument. 
[?]    [?]    [?]      The things that some of my family have done have frightened me. 
[?]    [?]    [?]      People have often misunderstood my intentions when I was trying to put them right and be helpful. 
[?]    [?]    [?]      I pray several times a week. 
[?]    [?]    [?]      I strongly defend my own opinions as a rule. 
[?]    [?]    [?]      I go to church almost every week. 
[?]    [?]    [?]      I find it hard to make talk when I meet new people. 
[?]    [?]    [?]      I can read a long while without tiring my eyes. 
[?]    [?]    [?]      Sometimes at elections I vote for men about whom I knew very little. 
[?]    [?]    [?]      I am easily embarrassed. 
[?]    [?]    [?]      I like to read newspaper editorials. 
[?]    [?]    [?]      I sometimes keep on at a thing until others lose their patience with me. 
[?]    [?]    [?]      I believe there is a God. 
[?]    [?]    [?]      At times I have been so entertained by the cleverness of a crook that I have hoped he would get by with it. 

Entries not matched to an item: 325, .05, False; True. 

Some 22 MMPI items which differentiate 
from the .05 to the .001 level between the Q- 
versus the L-oriented college man are given in 
Table 1; an additional six which differentiate 
at the .10 level are also listed, as well as three 
whose nominal significance is less than .10. 
It will be remembered that two groups of 100, 
each of which was broken into four quartiles 
for purposes of analysis, were used in sifting 
the 566 MMPI items. Only those items which 
inspection indicated to vary rather markedly 
and in the same direction for the two groups 
were picked out to test for levels of confi- 
dence. Before confidence levels were tested, 
all 100 males in both groups who were above 
the average for the total group in Q score 
were compared with the 100 whose abilities 
ran to the verbal; in other words, the entire 
group of 200 males was used in these com- 
parisons—a technique which tends to attenu- 
ate the differences which emerge when only 
extreme groups are employed. 


Six of the items in Table 1, Nos. 69, 78, 126, 140, 
204, and 295, are found on the Mf scale. On five of 
these items the marking shows a correlation between 
masculinity and Q-higher-than-L tendency; on the 
sixth item, “I am very strongly attracted by mem- 
bers of my own sex,” the marking is in the opposite 
direction, that is, the Q-higher-than-L males say Yes 
to this item more frequently (.05 l.c.). The Q-higher- 
than-L female answered the same way (1) and at 
the .001 level of confidence. It is difficult to rational- 
ize this item. It could mean that homosexual atti- 
tudes are more prevalent among high Q groups, male 
and female; it might also mean that they are less 
well socialized, rather timid, and at ease only with 
their own sex. However that may be, the dislike for 
poetry, cooking, journalism, dramatics, and Lewis 
Carroll’s Alice in Wonderland stands out as a typi- 
cal anti-aesthetic attitude common among the more 
“masculine” males. 
Three items from the Lie scale show up: 45, 225, 
255. The high Q male is more frequently certain he 
always tells the truth, that he doesn’t gossip, and 
that he knows something about all those men he 
votes for at election time. Naiveté and lack of in- 
sight seem to be the plausible explanation rather than 
an attempt at dissimulation, since dissimulation would 
be reflected in attempts to “fake good” on the clini- 
cal scales which patently is not the case. It may, of 
course, actually be true that the high Q man doesn’t 
gossip as often as his verbalist brother; just possibly 
he doesn’t converse as frequently and perhaps he 
isn’t as interested in people—this latter observation, 
it should be noted, would fit in with Wells’ and 
Adler’s contentions. 

The attitude of the high Q males toward religious 
matters (95, 258, 488) is somewhat ambiguous: They 
pray less frequently than the more verbal college stu- 
dent, and they go to church less frequently. Although 
the answer is not statistically significant, they more 
frequently are sure there is a god (258): Of the 100 
Q-higher-than-L men, 88 said Yes to this item, while 
only 80 of the L-higher-than-Q men answered in this 
fashion. In Altus’ study (1) of college women, the 
high Q women were definitely more religiose, but it 
must be admitted that the evidence for college men 
is equivocal. In both instances, however, the high Q 
student is more orthodox in belief even if he isn’t 
necessarily in his religious practice. 
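
The statement that the 88 versus 80 split on item 258 is not statistically significant can be illustrated with a two-proportion z test. This is only a sketch of one common test, not necessarily the one the author used; the function name `two_proportion_z` is hypothetical, and the 1.96 cutoff assumes a two-tailed test at the .05 level.

```python
import math


def two_proportion_z(yes_a: int, n_a: int, yes_b: int, n_b: int) -> float:
    """z statistic for the difference between two independent proportions (pooled SE)."""
    p_a = yes_a / n_a
    p_b = yes_b / n_b
    pooled = (yes_a + yes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    return (p_a - p_b) / se


# Counts reported in the text: 88 of 100 Q-higher-than-L men and
# 80 of 100 L-higher-than-Q men said Yes to "I believe there is a God."
z = two_proportion_z(88, 100, 80, 100)

# Assumed two-tailed .05 cutoff for a normal deviate.
significant_05 = abs(z) > 1.96
```

With these counts z is roughly 1.5, below the assumed cutoff, which is consistent with the item being listed among those whose nominal significance is less than .10.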

The dislike for things verbal stands out in several 
of the high Q answers: 78, 126, 188, 204, 295, 428. 
They don’t like poetry, they don’t like dramatics, 
they “tire” their eyes if they read a lot, they wouldn’t 
like to be a journalist, they didn’t like Carroll’s Alice, 
they don’t like to read editorials in the newspapers. 
This finding strikingly corroborates the data for col- 
lege women (1). The high Q person simply does not 
like to read to the same extent as does the verbalist. 
Which is cause, however, and which is effect is un- 
clear, even though the facts of the matter are patent. 

The high Q person is not so dominant or sure of 
himself in certain matters: 82, 443, 520, 564. He’s 
easily downed in an argument, he passes things up 
when others criticize, he doesn’t defend his opinions 
strongly as a rule. He is therefore less dominant in 
interpersonal relationships; for this reason we should 
expect to find, perhaps, that most salesmen and al- 
most all politicians would be on the verbalist side. 

Constriction and difficulty in interpersonal rela- 
tions is also evident in many other answers: 82, 180, 
237, 321, 366, 384, 404, 443, 504, 520, 564. Other peo- 
ple outtalk him, he finds it difficult to make conver- 
sation with new people, his relatives are not in sym- 
pathy with him, he is easily embarrassed, he feels 
lonely even when with people, he can’t unburden 
himself freely to others when talking of himself, he 
passes up things when others criticize, he doesn’t 
mask his attitude toward others when he should. A 
rather gauche, retiring, somewhat socially inept per- 
son is here adumbrated. A number of these items also 
characterized the high Q woman (1) but not to the 
same degree as is noted here for the college male. 
Naiveté, dislike of the printed page, diffi- 
culty in social relationships, greater mascu- 
linity (put more exactly, antipathy to the 
aesthetic): these are the general character- 
istics which the MMPI items show to be true 
of the Q-higher-than-L college male when he 
is compared with his verbalist brother. 


Discussion 


The assumption which underlay the present 
research was that differences in quantitative 
and linguistic abilities among college males, 
as measured on the ACE, should be reflected 
in attitudinal frames as measured by the 
MMPI. It was hypothesized that the Q- 
higher-than-L male should tell more “lies” on 
the MMPI and appear more masculine. Both 
hypotheses were borne out at statistically sig- 
nificant levels when the L and Mf scales were 
checked for the two groups of men. 

The individual items of the MMPI which 
differentiated high Q and high L groups cor- 
roborated previous research (1, 4, 5) which 
showed an antipathy toward reading and simi- 
lar verbal matters and occupations for the 
quantitatively oriented person. Both male and 
female high Q scorers also appear to be less 
forward, aggressive, and sure of themselves in 
social relationships than the student with more 
verbalist propensities. The data lend some 
plausibility to a hypothesis that social domi- 
nance and leadership may show more than 
a chance linkage to L-higher-than-Q type of 
abilities. It would appear that the politician, 
who lives primarily in a welter of words, 
should be a verbalist far more frequently than 
the 50/50 probability of the flip of a coin 
turning up heads. These tentative hypotheses 
lend a bit of piquancy when one speculates 
about Freud and his marked deficiency in 
matters quantitative. Had he been more adept 
at manipulating numbers, perhaps he would 
have remained in laboratory research; had 
this obtained, there would have been no pri- 
vate practice of medicine, no neurotics, no 
dream theories, no psychoanalysis. But this 
is pure speculation, interesting perhaps but in 
the view of history completely idle. 

The contention of Wells (6) and Munroe 
(3) that general adjustment is not related to 
variation in quantitative and verbal abilities 
appears to be corroborated by the present 
study and by the one which preceded it (1). 
In summary, the one whose gifts are primarily 
verbal tends to be more sophisticated, self- 
insightful, socially dominant, literary, and less 
orthodox in religious matters; at least that 
is the way college students of both sexes tend 
to evaluate themselves on the MMPI and, to 
the extent that their answers are truthful and 
valid, it should show in actual behavior out- 
side the narrow limits of the testing situation. 


Received June 16, 1958. 
Early Publication. 
a Personality Rigidity Scale 1, 2 


John M. Rehfisch 


Sonoma State Hospital, Eldridge, California 


This is a report of the correlations between 
a true-false inventory scale for personality 
rigidity (Ri) and numerous other objective 
psychological measures. An earlier paper (5) 
described the theory and derivation of Ri. 
Briefly, the scale was constructed by selecting 
items which differentiated between designated 
samples of rigid and flexible subjects. The 
criterion employed was ratings of rigidity by 
staff assessors from the University of Cali- 
fornia’s Institute of Personality Assessment 
and Research (IPAR). A cross-validating ap- 
praisal of a preliminary version of Ri, in terms 
of staff descriptions of high and low scorers, 
indicated that the scale measures essential at- 
tributes of the conceptualized rigidity dimen- 
sion. High scorers, as compared to lows, were 
described generally as constricted, inhibited, 
anxious, guilt prone, conservative, socially in- 
troverted, and inflexible in their social roles. 
Low scorers, by contrast, tended to be seen 
as adaptable, spontaneous, original, fluent in 


1 This research is supported in part by the United 
States Air Force under Contract No. AF 18 (600)-8, 
monitored by Technical Director, Detachment No. 7 
(Officer Education Research Laboratory), Air Force 
Personnel and Training Research Center, Maxwell 
Air Force Base, Alabama. Permission is granted for 
reproduction, translation, publication, use, and dis- 
posal in whole and in part by or for the United 
States Government. Personal views or opinions ex- 
pressed or implied in this publication are not to be 
construed as necessarily carrying the official sanction 
of the Department of the Air Force or of the Air Re- 
search and Development Command. 

2 The author is very much indebted to Harrison 
Gough for his advice and encouragement and for 
his critical reading of a preliminary version of the 
manuscript; and to Donald MacKinnon, Director of 
the Institute of Personality Assessment and Research, 
for many valuable suggestions and for permission to 
use the data and facilities of the Institute. 


thought and speech, curious, clear-thinking, 
assertive, and self-indulgent. 

The present study was designed to further 
explicate Ri by defining its linear relations to 
other psychological scales and tests. 


Procedures and Results 


Correlation coefficients were computed be- 
tween Ri and: (a) the standard MMPI scales, 
(b) three special MMPI scales (1, 4, 7), (c) 
the standard scales of the California Psycho- 
logical Inventory (CPI) (2), (d) the Terman 
Concept Mastery Test (6), (e) the Idea Clas- 
sification Test,3 and (f) tests from the Guil- 
ford Creativity battery (3). Two samples of 
AF captains (60 in each sample), separate 
from the assessed samples used in the deriva- 
tion and initial validation of the scale, were 
employed as subjects. Their scores on the 
above listed measures were obtained from re- 
sults of an IPAR paper-and-pencil testing 
program, which had been administered to the 
officers at their home bases. 

Correlations were computed also between 
Ri and the UCPOS Ethnocentrism and Fas- 
cism scales. The single sample of 100 assessed 
AF captains used in obtaining the latter two 
correlations had been included among the 
groups employed in deriving Ri: this sample 
was the only one for which scores on both Ri 
and the UCPOS scales were available. 

All the findings are listed in Tables 1, 2, 3, 
and 4. 

Prominent among the major correlates are 
various indices of social behavior and of 
dominance and leadership ability. High scorers 

3 Acknowledgment is made to Educational Testing 
Service for making this test available prior to its 
publication. 
Correlates of a Personality Rigidity Scale 


Table 1 
Correlations of Ri with Scales of the MMPI 





17 A7 
BS nae .59** 
10 OS 
32° A3** 
.26* .25 
17 10 
16 16 
.00 16 
18 Ai** 
08 — .04 
21 03 
A** Jeo 
4R** .57** 
44** — .59** 
.66** —.65** 


(1st factor: anxiety) 
Es (ego-strength) 
Lp (leadership) 


* Significant at the .05 level 
** Significant at the .01 level. 


on Ri, as opposed to lows, would appear to be 
socially introverted (Si, Sy), submissive (Do), 
and relatively deficient in personality at- 
tributes associated with social presence (Sp) 
and with leadership ability (Lp). 
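
The significance markers in Tables 1 and 2 can be checked against the critical values of r for samples of 60. The sketch below is an illustration only, not part of the original analysis. It assumes a two-tailed test with N - 2 = 58 degrees of freedom; the two t values are standard tabled constants entered by hand, and the function name `critical_r` is hypothetical.

```python
import math

# Two-tailed critical t values for df = 58, taken from a standard t table
# (an assumption; the report does not say how significance was determined).
T_CRIT_DF58 = {0.05: 2.0017, 0.01: 2.6633}


def critical_r(t_crit: float, df: int) -> float:
    """Smallest |r| that reaches significance, from r = t / sqrt(t^2 + df)."""
    return t_crit / math.sqrt(t_crit ** 2 + df)


n = 60            # size of each sample of AF captains
df = n - 2
r_05 = critical_r(T_CRIT_DF58[0.05], df)
r_01 = critical_r(T_CRIT_DF58[0.01], df)
print(f"N = {n}: |r| >= {r_05:.3f} at the .05 level, |r| >= {r_01:.3f} at the .01 level")
```

Under these assumptions a coefficient of .26 reaches the .05 level and one of .25 does not, which agrees with the starred and unstarred entries of those sizes in Table 1.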


Table 2 


Correlations of Ri with Scales of the CPI 
for Two Samples of AF Captains 


r r 


(N=60) (N=60) 


Do (dominance) 

Cs (capacity for status) 
Sy (sociability) 

Sp (social presence) 

Sa (self-acceptance) — .29* 
Wb (sense of well-being) — .38** A9** 
Re (responsibility ) —.16 a 
So (socialization) — .08 10 
Sc (self-control) 06 19 
To (tolerance) —.55** 61** 
Gi (good impression) — .32* 35** 
Cm (communality ) — .03 .27° 
Ac (achievement via conformance) —.52** 50** 
Ai (achievement via independence) -.50** .23 
Ie (intellectual efficiency) -.64** ga 
Py (psychological-mindedness) — 24 39** 
Fx (flexibility) — .33* 31* 
Fe (femininity i 21 


—.53** 
—.55** 
—.70°* 


— 75% 


. 59** 
sse* 
OP es 
"4% 
_ 


41** 


* Significant at the .05 level 
** Significant at the .01 level 


Table 3 


Correlations of Ri with Tests of Cognitive Functioning 
for Two Samples of AF Captains 


r 


(N=60) 


r 
(N=60) 


Intelligence tests: 
Terman Concept Mastery 33* 
Idea Classification 21 
Creativity tests :* 


Unusual Uses 
Gestalt Transformation 


* Significant at the .05 level. 
** Significant at the .01 level 
* Correlations were computed, in one of the above samples, 
between Ri and other tests from the creativity battery, including 
Plot Titles, Match Problems, Consequences, and Controlled 
Association. These correlations were all approximately zero. 


A second cluster of correlates suggests that 
high Ri persons are likely to be anxious and 
depressed (A, D), self-dissatisfied, and em- 
phatic in the expression of their complaints 
(Sa, K, Wb). 

There are indications also that high scorers 
tend to be less intellectually efficient (Ie), 
less interested in and motivated toward scho- 
lastic achievement (Ac, Ai),4 less original 
(Unusual Uses), and slightly less intelligent 
than low scorers. 

The correlations of Ri with the Tolerance, 
Fascism, and Ethnocentrism scales accord 
with the findings from various other studies 
which reflect an association between rigidity 
and prejudice. 

The Es scale, which is negatively related to 
Ri, was derived by contrasting the item re- 
sponses of patients who did and patients who 


Table 4 
Correlations of Ri with Scales of the UCPOS 


for a Sample of AF Captains 


F (fascism) 
E (ethnocentrism) 


* Significant at the .05 level 


4 The Ac and Ai scales were derived by selecting 
items capable of discriminating between high and 
low achievers at the high school (Ac) and college 
(Ai) levels. 


did not benefit from psychotherapy (1). High 
Es scores are predictive of a favorable thera- 
peutic outcome. The present implication of a 
negative relation between rigidity and ca- 
pacity for improvement in psychotherapy ap- 
pears reasonable; for it is quite probable that 
the modification in patient personality, seem- 
ingly involved in a beneficial therapeutic 
process, is facilitated by psychological flexi- 
bility. 

The negative correlations with the Flexi- 
bility scale were naturally anticipated. That 
the correlations are not larger is probably a 
consequence of essential dissimilarities be- 
tween the two scales due to differences in the 
methods used in their construction. The Fx 
scale was developed in a rational a priori 
manner (2), whereas Ri was derived by em- 
pirical analysis. 

Finally, there are indications (Cs) that 
qualities assessed by Ri are negatively re- 
lated to capacity for high social status. 


Summary and Conclusions 


The test and scale correlates of a scale for 
personality rigidity indicate a tendency for 
high scorers on Ri, as contrasted to lows, to 
be: (a) socially introverted and lacking in 
social presence (defined as poise, spontaneity, 
and self-confidence [2]); (b) submissive and 
low in leadership qualities; (c) anxious and 
self-disparaging; and (d) unoriginal and rela- 
tively deficient in cognitive and motivational 
factors associated with intellectual competence 
and achievement. These findings are generally 
congruent with other validating data (5), in- 
cluding the assessor descriptions of high and 
low scorers, obtained among subjects separate 
from those employed in the present study. The 
qualities listed above may therefore be con- 
sidered fairly stable Ri correlates for the kinds 
of subjects used in defining the scale, i.e., 
adult males (university students [5] and AF 
captains) who were, as a group, above the av- 
erage in intelligence and education. 

There were indications also, among the cor- 
relating scales, of a positive association be- 
tween Ri and prejudice. Such a relationship 
had not been specifically indicated by the as- 
sessor descriptions; prejudice may, accord- 
ingly, be a somewhat less salient and/or re- 
liable correlate of the Ri scale than the other 
qualities listed above. 
Aron Wolfe Siegman 


Many investigations have been undertaken 
in order to determine some of the variables 
which are associated with ethnocentric atti- 
tudes. Frenkel-Brunswik and her associates 
have reported fairly high correlations between 
ethnocentric attitudes and authoritarian ide- 
ology of the fascistic variety (1). Some au- 
thors have suggested that ethnocentric atti- 
tudes are associated with personality malad- 
justment (6, 7, 9) and lower intelligence (6, 
7, 12). Very little attention, however, has been 
paid to the question of whether these obtained 
relationships may be a function of the 
degree to which the culture sanctions ethno- 
centric attitudes. The major purpose of the 
present study was to investigate the relation- 
ship between authoritarian ideology, person- 
ality maladjustment, and intelligence on the 
one hand, and anti-Negro attitudes on the 
other hand, in a group of college students 
who were born, raised, and at the time of this 
study resided in the South. 


Method 


Subjects. The Ss were 41 University of 
North Carolina undergraduates, all of whom 
were born, raised, and at the time of this 
study resided in the South. 

Measures. All Ss were administered the six 
items from the E scale which pertain to Ne- 
groes (1, p. 142), and a 29-item F scale (1, 
pp. 255-257). The Taylor MAS (13), which 
was found to correlate .92 with a general neu- 


1 This investigation was conducted while the au- 
thor was at the University of North Carolina. The 
study was supported in part by a grant from the 
Behavioral Science Research Fund. 


roticism scale (5), was administered as an 
index of Ss’ adjustment level. All Ss were also 
administered Gough’s Pr scale (7), which 
consists of personality oriented MMPI items 
which discriminated between high and low 
scorers on an anti-Semitism scale. There is 
evidence to suggest that a high Pr scale score 
is associated with personality maladjustment 
(7, 9). 


It has been demonstrated that vocabulary test per- 
formance correlates very highly with performance on 
an intelligence test which samples a wide variety of 
intellectual abilities (14). In the present study, the 
vocabulary test of the Shipley-Hartford (S-H) Re- 
treat Scale (10) was used to determine Ss’ intelli- 
gence level. In order to raise the ceiling of the test, 
a three-minute time limit was imposed. 

In a recent study which investigated the relation- 
ship between Ss’ F scale and Einstellung test scores, 
it was found that it was not so much the content 
of the F scale items, but rather Ss’ tendency to 
“acquiesce” and answer “True,” irrespective of con- 
tent, which was responsible for the positive correla- 
tion between Ss’ F scale and Einstellung test scores 
(8). This finding raises the problem whether the 
same explanation may not be true for the frequently 
noted negative correlation between Ss’ F scale and 
intelligence test scores (1, 3, 6, 11). Consequently, 
all Ss in this study were also administered a reversed 
F scale (2). 

In order to minimize defensiveness, Ss were as- 
signed numbers and requested not to indicate their 
names on the attitude and personality questionnaires. 


Results and Discussion 


Table 1 lists Ss’ mean and standard devia- 
tion scores on the various tests. Table 2 indi- 
cates that no significant correlations were ob- 
tained between Ss’ E scale scores on the one 
hand, and their Taylor MAS, Gough Pr, and 
S-H vocabulary test scores on the other hand. 
This finding is clearly consistent with the hy- 
pothesis that neither neuroticism nor intelli- 
gence is a significant source of variance in cul- 
turally accepted ethnocentric attitudes. 
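
The absence of significant E scale correlations can be illustrated with an approximate test. The sketch below is not the author's procedure: it uses Fisher's r-to-z transformation with a normal approximation, takes N = 41 from the Method section, and uses the E scale coefficients as they appear to read in Table 2, which is an assumption. The function name `fisher_z_p` is hypothetical.

```python
import math


def fisher_z_p(r: float, n: int) -> float:
    """Approximate two-tailed p for H0: rho = 0, via Fisher's r-to-z transform."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


N = 41  # number of Ss reported in the Method section

# E scale correlations as read from Table 2 (assumed values).
e_scale = {"S-H vocabulary": -0.03, "Taylor MAS": 0.16, "Gough Pr": 0.06}

p_values = {test: fisher_z_p(r, N) for test, r in e_scale.items()}
for test, r in e_scale.items():
    print(f"E scale vs {test}: r = {r:+.2f}, approximate p = {p_values[test]:.2f}")
```

With these inputs every approximate p value lies well above .05, in line with the statement that none of the three correlations was significant.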

Table 2 indicates that a significant positive 
correlation was obtained between Ss’ E scale 
and F scale scores. The correlation, however, 
was significantly lower than the correlations 
which have been obtained by the authors of 
The Authoritarian Personality (1, p. 263) for 
non-Southern groups (average r = .77). This 
finding suggests that the magnitude of the 
correlation between a particular set of ethno- 
centric attitudes and authoritarian ideology, 
or authoritarianism as a personality variable 
(1), decreases with increased cultural ap- 
proval of such ethnocentric attitudes. 

Table 2 indicates that a significant negative 
correlation was obtained between Ss’ F scale 
(but not reversed F scale) and S-H vocabu- 
lary test scores. Significant positive correla- 
tions were obtained between Ss’ F scale scores 
on the one hand, and their Taylor MAS and 
Gough Pr scale scores on the other hand. 
These results are consistent with the hypothe- 
sis that intelligence and neuroticism are sig- 
nificant sources of variance in authoritarian 
ideology. The finding, however, that neither 
neuroticism nor intelligence was a significant 
source of variance in culturally approved eth- 
nocentric attitudes, certainly suggests the pos- 
sibility that the contribution of neuroticism 
and intelligence to authoritarian ideology too 
may decrease with increased cultural approval 
of such ideology. In fact, the correlation be- 
tween Ss’ Taylor MAS and F scale scores in 
the present study was significantly lower than 
the correlation which was obtained by Davids 
(3) in a group of Harvard students (.69) 
whose mean score per item on the F scale 


Table 1 


Ss’ Mean and SD Scores on the Various 
Psychological Tests 


Tests                     Mean 

F scale                   99.51 
E scale                   19.73 
Taylor MAS                15.15 
Gough Pr scale            10.51 
S-H Vocabulary MA         16.42 
Table 2 


Correlations Between the E and F Scales and 
Other Psychological Tests 








S-H 
Vocab- Taylor Gough E F 


Tests ulary MAS Pr _ scale scale 





E scale 

F scale 

F scale 
(reversed) 


— .03 16 .06 
—38* 31° os 6 = Al? 
—.14 03 08 








* Significant at the .05 level. 
** Significant at the .01 level. 


(2.97) was lower than that of the present 
group (3.43).* 
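
The per-item figure for the present group can be recovered from the reported totals. The short sketch below assumes that the first mean listed in Table 1 is the F scale mean and that it is based on the 29-item form described under Measures; the variable names are illustrative.

```python
# Mean F scale score from Table 1 and the number of F scale items given
# in the Method section (both taken from the report; names are illustrative).
f_scale_mean = 99.51
f_scale_items = 29

mean_per_item = f_scale_mean / f_scale_items
print(f"{mean_per_item:.2f}")  # prints 3.43
```

The result matches the 3.43 quoted for the present group.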


Summary 


No significant correlation was obtained, in 
a group of Southern undergraduates, between 
Ss’ performance on an anti-Negro bias (E) 
scale and Ss’ Taylor MAS, Gough Pr and 
Shipley-Hartford vocabulary scores. Further- 
more, the correlation between Ss’ E and F 
scale scores was significantly lower than in 
non-Southern groups. These findings are con- 
sistent with the general hypothesis that the 
negative correlates of ethnocentric attitudes 
tend to decrease as the culture countenances 
these attitudes. 
U. S. Naval School of Aviation Medicine 


Whereas personality scales are designed to 
indicate behavioral tendencies, it is generally 
difficult to find consistent behavioral criteria 
upon which to validate the scales. The present 
study takes advantage of such behavioral 
criteria extant in the Naval Air Training Pro- 
gram. While it is understood that a student 
may withdraw from flight training at any 
time, there is considerable pressure from peers 
and instructors to “stick it out.” Some stu- 
dents drop for good reason, but in most cases 
the action is no clearer than a vaguely ex- 
pressed “distaste for the situation.” The 
Gordon Personal Profile has a scale termed 
Responsibility which is defined as follows: 
“Individuals who are unable to stick to tasks 
that do not interest them, and in the extreme, 
who tend to be flighty or irresponsible, usually 
make low scores on this scale” (1). The pri- 
mary purpose of this study was to validate 
this scale against the criterion of voluntary 
withdrawal. A matter of secondary interest 
was the extent to which the Emotional Sta- 
bility scale of the Profile was related to flight 
failure. A negative relation would be expected 
since anxiety or tenseness often hampers flight 
performance. 

In the fourteenth week of preflight training, 
1039 cadets completed the Gordon Personal 
Profile. Six months later, 95 had flight failed, 
and 117 had withdrawn voluntarily. The re- 
maining 887 cadets were categorized as “suc- 
cessful.” Table 1 shows the point-biserial cor- 
relation between each scale and each type of 
attrition. The data show consistent zero rela- 


1 The views expressed here are not to be construed 
as necessarily reflecting those of the Navy. 


tionships. The frequency distributions of Total 
Score and the two subscales in question were 
examined in search of a discriminating cutoff 
which might be used in lieu of linear correla- 
tion statistics. From this analysis it was evi- 
dent that the very low scoring students did 
not fail or withdraw at a higher rate. 
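
For readers unfamiliar with the statistic, the point-biserial coefficient relates a continuous score to a two-category outcome. The sketch below is a generic illustration rather than the author's computation; the scores and outcomes are made-up numbers, and the function name `point_biserial` is hypothetical.

```python
import math


def point_biserial(scores, outcome):
    """Point-biserial r between a continuous score and a 0/1 outcome."""
    n = len(scores)
    group1 = [s for s, d in zip(scores, outcome) if d == 1]
    group0 = [s for s, d in zip(scores, outcome) if d == 0]
    p = len(group1) / n          # proportion in the "1" category
    q = 1.0 - p
    mean_all = sum(scores) / n
    # Population (divide-by-n) standard deviation of all scores.
    sd_all = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    mean1 = sum(group1) / len(group1)
    mean0 = sum(group0) / len(group0)
    return (mean1 - mean0) / sd_all * math.sqrt(p * q)


# Hypothetical scale scores and outcomes (1 = withdrew, 0 = successful);
# these are invented for illustration and are not data from the study.
scores = [12, 15, 9, 14, 11, 13, 10, 16, 12, 14]
withdrew = [0, 1, 0, 0, 1, 0, 0, 1, 0, 0]

r_pb = point_biserial(scores, withdrew)
print(f"r_pb = {r_pb:.3f}")
```

The coefficient is numerically the same as the ordinary product-moment correlation between the scores and the 0/1 coding, so a value near zero means the two outcome groups have nearly equal mean scores.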


Table 1 


Means, Standard Deviations, and Validity Coefficients 
on the Gordon Personal Profile 





Validity 
coefficient 
With- 


Scale draw Failure 





Ascendancy 
Responsibility 
Emotional Stability 
Sociability 

Total Score 


03 
03 
04 
.00 
04 


Nuun 
MAAunSO w 


_ 





In summary, the scales of the Gordon Per- 
sonal Profile were not related to the behavioral 
criteria employed here. Recognizably, the 
criteria are complex, but at least some positive 
relationship should have been found. These 
data are inconsistent with the contention that 
the Responsibility scale differentiates indi- 
viduals who are unable to stick to a task. 
This study represents an attempt to explore 
the effects of the drug chlorpromazine on sev- 
eral different aspects of the learning process 
and on verbalized social adaptation as these 
take place in a group of hospitalized chronic 
psychotic subjects (Ss). The learning vari- 
ables sampled included serial verbal learning, 
retention of serial verbal material learned on 
the previous day, the acquisition of a motor 
skill, reminiscence of this motor skill, problem 
solving, and a somewhat related task involv- 
ing verbalized social adaptation and judg- 
ment. These tasks would seem to tap various 
levels of the learning process, ranging from 
simple motor learning to problem solving and, 
finally, “social adaptation.” Inasmuch as this 
was an exploratory study, no directional hy- 
potheses were offered. 


Method 
Subjects and Treatment 


Fifty-two chronic psychotic male veterans, 
committed to a Veterans Administration Hos- 
pital, were studied. Ss were chosen by the 
medical staff on the basis of judgments that 
they were physically suitable for chlorpro- 
mazine treatment, that they would be able to 
cooperate when tested with a battery of sim- 


1 This article is based upon part of a dissertation 
submitted to the Graduate School, Vanderbilt Uni- 
versity, in partial fulfillment of the requirements for 
the Ph.D. degree by the senior author and written 
under the direction of the junior author. The study 
was carried out at the Veterans Administration Hos- 
pital, Murfreesboro, Tennessee. Much aid was given 
by F. H. Deter, H. B. McTyre, and other members 
of the hospital staff. 

2 Now at Central State Hospital, Nashville, Ten- 
nessee. 


ple psychological tests, and that they were 
free from organic disturbances which might 
influence their behavior. The majority of the 
Ss carried the diagnosis of one of the subtypes 
of schizophrenia, but there were several who 
were classified as manic-depressive. Ss were on 
the Continued Treatment Service where they 
received the same routine ward care to which 
they had become accustomed. None of the Ss 
received any other type of somatic therapy 
during the period of study, but several Ss 
were members of an activity group. Prior to 
treatment, Ss were randomly divided into two 
groups, one of which was later designated as 
an experimental group and the other as a 
control group. The physician who took charge 
of ordering the medication arbitrarily desig- 
nated these groups. No other hospital person- 
nel were aware of which was the experimental 
group and which was the control group. The 
mean age for the experimental group was 38 
and for the control group was 39. The mean 
duration of hospitalization was 8.9 years for 
the experimental group and 7.8 years for the 
control group. 

All Ss in the experimental group were placed 
on oral chlorpromazine for a period of two 
months. Dosage was gradually increased to 
about 800 mg per day, usually over a period 
of about 35 days, and then gradually de- 
creased until evaluation, at which time 36 Ss 
were receiving a minimum of 300 mg, and 16 
Ss were receiving 200 mg per day. Equal num- 
bers of control and experimental Ss were rep- 
resented in each minimum dosage group. Ss 
in the control group received placebos pre- 
pared by the Smith, Kline, and French Drug 
Co. All Ss received the same routine labora- 
tory procedures. 
Procedure 


All Ss were tested individually in two ses- 
sions on two consecutive days. 


A pursuit rotor apparatus was used to measure 
motor skill acquisition. The turntable was set to run 
in a clockwise direction at 40 rpm. A chronoscope 
measured the “time on target.” Instructions were fol- 
lowed by 8 trials of 45 sec. each, with a 15-sec. rest 
interval between trials. If S refused to do the task in 
spite of urging, a subjective judgment was made as 
to whether he would be able to cooperate on any of 
the other tasks. Sometimes the testing procedure had 
to be stopped at this point. 

A Lane-type memory drum was used in the verbal 
learning and retention tasks. The following 8-item 
list was learned by the serial anticipation method: 
sand, tool, dust, inch, pain, robe, junk, year. Each 
word was exposed for 3 sec., and individual trials 
were separated by a 6-sec. rest interval. The word 
list was exposed 20 times, or until it had been 
learned to a criterion of three consecutive complete 
repetitions, whichever occurred first. 

After completing the serial learning task, S re- 
turned to the pursuit rotor where he received five 
additional trials. 

A questionnaire called the Verbalized Social Adap- 
tation Questionnaire was developed specifically for 
the study.? The questionnaire consisted of a series of 
statements which presumably contained a principle 
by which social cooperation could be elicited. Each 
statement was followed by a question with two al- 
ternative answers. The answer which made use of 
the “principle” given in the statement was consid- 
ered correct. Following is an example of one of the 
items: “A person who is lost in a strange city can 
often find out how to get where he wants to go by 
asking someone who lives there. If you were lost in 
a strange town, would you: (a) Ask directions from 
a policeman, or (b) Look carefully at all the street 
signs until you found the one you wanted?” Thirty- 
six such items were formulated. The 36 questions 
were divided into two forms of 18 questions each. 
The items were read to each S individually. It was 
felt that a “correct” answer implied a higher level 
of social adaptation, or judgment, than did an “in- 
correct” answer. One-half of the Ss received Form I 
of the questionnaire on the first day, and the other 
half received Form II on the first day. 

Upon his return to the testing room on the second 
day, S was first given 10 additional serial anticipa- 
tion trials, hereafter called retention trials, with the 
same word list which he had learned the previous 
day. 

After completing this task, S worked on the Mak- 


3 George E. Copple initiated the concept of the 
Verbalized Social Adaptation Questionnaire and de- 
veloped several of the items. A copy of the question- 
naire has been deposited with the American Docu- 
mentation Institute. Order Document No. 5583, re- 
mitting $1.25 for 35-mm. microfilm or $1.25 for 6 
by 8 in. photocopies. 
ing the Last Draw Problem (3, p. 140). This prob- 
lem was chosen because of its comparative simplicity 
and because its difficulty level is easily altered. The first 
phase of the problem used four matchsticks as ma- 
terial. If the S was able to solve this task three con- 
secutive times out of five trials, he was asked whether 
or not he could give a rule whereby he could tell 
someone else how to win the game every time. Those 
who were unable to solve the problem three con- 
secutive times were told the principle and then given 
three additional trials. For Ss who were able to solve 
the first problem, the next level utilized five match- 
sticks as materials; those who failed the first prob- 
lem, even with help, did not continue to the second 
level. The five-stick problem was administered in the 
same manner as the four-stick problem. For those 
who solved the second problem, the last problem 
utilized seven matchsticks, and it was administered 
in the same way the first two problems had been. 

The last measurement taken on the second day was 
the alternate form of the social adaptation question- 
naire, and it was administered in the same manner 
as it had been the day before. 


Results 


During the course of the testing it was im- 
mediately apparent that many Ss could not 
perform on one or more of the tasks. Some Ss 
were able to cooperate on one of the tasks 
and not on another. The judgment as to 
whether or not an S was testable was made 
on a subjective basis. The judgment was, how- 
ever, based on the observed behavior of the S, 
and if there was any question as to his fitness 
for undertaking a given task, an attempt was 
made to gain his cooperation. At the time of 
the testing, E was not aware as to which indi- 
viduals were being given the drug and which 
were receiving placebos. 

It is felt that the “testable—not testable” 
factor is itself an important source of infor- 
mation as to the broader behavioral effects of 
the experimental drug. A more direct measure 
of the effect of chlorpromazine on learning 
performance per se is obtained by comparing 
the performance records for those Ss in each 
group who were able to undertake each task. 


Serial Verbal Learning 


Twenty Ss in the chlorpromazine group 
were testable on the verbal learning task, 
while only nine of the placebo group Ss were 
testable. The resulting chi-square figure of 
9.6 is significant at less than the .01 level. 

When testable Ss in each group were com- 
pared on individual original learning and re- 
tention learning trials, none of the results ap- 
proached significance. There was also no sig- 
nificant difference between groups when the 
sums of individual scores were compared. The 
Median Test was used to make these com- 
parisons. 
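
The logic of the Median Test can be sketched as follows. This is a generic illustration with made-up scores, not the authors' data or exact procedure: scores from both groups are pooled, each score is classed as above or not above the pooled median, and the resulting 2 x 2 table is evaluated by chi-square. No continuity correction is applied here, which is an assumption, and the function name `median_test_chi2` is hypothetical.

```python
def median_test_chi2(group_a, group_b):
    """Median test: chi-square (1 df, no continuity correction) on counts
    above vs. not above the pooled median."""
    pooled = sorted(group_a + group_b)
    n = len(pooled)
    mid = n // 2
    median = pooled[mid] if n % 2 else (pooled[mid - 1] + pooled[mid]) / 2
    a = sum(x > median for x in group_a)   # group A above the median
    b = len(group_a) - a                   # group A at or below the median
    c = sum(x > median for x in group_b)   # group B above the median
    d = len(group_b) - c                   # group B at or below the median
    total = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return median, total * (a * d - b * c) ** 2 / denom


# Hypothetical learning scores for two groups (invented for illustration).
drug = [14, 9, 12, 17, 11, 15, 13, 10]
placebo = [8, 12, 7, 13, 9, 11, 6, 10]

median, chi2 = median_test_chi2(drug, placebo)
print(f"pooled median = {median}, chi-square = {chi2:.2f}")
```

A chi-square below 3.84 with one degree of freedom, as in this invented example, would be reported as not significant at the .05 level.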


Motor Skill Acquisition 


A significantly larger number of Ss in the 
chlorpromazine group, 24, than in the placebo 
group, 16, were found to be able to undertake 
the pursuit rotor task (χ² = 6.9, p < .01). 
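
The chi-square for the testable versus not testable split on the pursuit rotor can be approximated from the counts given. The sketch below is a check under stated assumptions, not the authors' own computation: it assumes the 52 Ss were split evenly into groups of 26 and that no continuity correction was applied. The function name `chi2_2x2` is hypothetical.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]], uncorrected."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


GROUP_SIZE = 26  # assumed: 52 Ss divided equally between drug and placebo

drug_testable, placebo_testable = 24, 16   # counts reported in the text
chi2 = chi2_2x2(drug_testable, GROUP_SIZE - drug_testable,
                placebo_testable, GROUP_SIZE - placebo_testable)
# With 1 df, the upper-tail probability of chi-square is erfc(sqrt(chi2 / 2)).
p = math.erfc(math.sqrt(chi2 / 2))
print(f"chi-square = {chi2:.1f}, p = {p:.4f}")
```

Under the equal-split assumption the statistic comes to about 6.9 with p below .01, in line with the reported figures.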

When the nonparametric White’s Test was 
applied to each of the trials and also to the 
sums of individual scores on all of the trials 
for both the original learning phase and the 
reminiscence phase, the results did not indi- 
cate a significant difference between groups. 


Problem Solving 


Twenty-one chlorpromazine Ss were test- 
able on the problem-solving task, while only 
15 placebo Ss were testable. However, the re- 
sulting chi square of 3.6 is not considered sig- 
nificant for purposes of this study. No sig- 
nificant differences were obtained when the 
groups were compared on the number of cor- 
rect solutions to each of the problems by 
means of the chi-square test. 


Verbalized Social Adaptation 


When the two groups were compared on the 
testable-not testable split by means of the 
chi-square test, significantly more Ss in the 
chlorpromazine group, 23, were testable than 
in the placebo group, 16 (χ² = 6.3, p < .02). 

In analyzing the data in this section, the 
Median Test was used in preference to White’s 
Test because the large number of ties pre- 
cluded the efficient application of White’s 
Test. When the groups were compared with 
regard to the number of correct items on 
both forms of the questionnaire, the groups 
differed at less than the .05 level. The groups 
were also significantly different when com- 
pared on each form separately. In all in- 
stances, the chlorpromazine group had signifi- 
cantly more items correct. 

In order to obtain some check on the reliability 
of the two forms of the questionnaire and also to 
get some indication of the reliability of the 
individual’s responses to the two forms, a Pearson 
correlation coefficient between the two forms was 
calculated. The resulting coefficient of .85 is 
significant at less than the .01 level. 


Table 1 

Summary of Statistical Findings 

Task                           Testable-not testable   Testable Ss only* 

Serial verbal learning         χ² < .01 
  Original                                             … 
  Retention                                            … 

Pursuit rotor learning         χ² < .01 
  Original                                             … 
  Reminiscence                                         … 

Problem solving                … 
  Four sticks                                          … 
  Five sticks                                          … 
  Seven sticks                                         … 

Verbalized Social Adaptation   χ² < .02 
  Form I                                               MT < .05 
  Form II                                              MT < .05 
  Combined forms                                       MT < .05 

* The test employed and the obtained level of significance are 
indicated in each case. MT indicates Median Test; WT indicates 
White's Test; NS indicates not significant at the .05 level. 
Entries shown as … are illegible in the source. 


Summary of Statistical Findings 


Table 1 presents a brief summary of the 
statistical findings. Rather than presenting the 
findings for the individual trials on the verbal 
and motor learning tasks, the table shows the 
results for the sums of individual scores on 
all trials combined. The other table headings 
refer to previously mentioned tests. 


Discussion 


Effects of Chlorpromazine on Level of Coop- 
eration 


Because of the differences between the two 
groups on the testable—not testable factor on 
all of the tasks except the problem-solving 
task, it is inferred that chlorpromazine in- 
creases the motivational level of chronic psy- 
chotic patients to the extent that they are 
willing (able) to try the tasks which were 
presented. Although the drug seems to have 
little effect on the learning processes them- 
selves, it appears to have a definite effect on 
motivational factors associated with the proc- 
ess of learning. 

One of the reasons that previous studies 
have indicated such a large discrepancy be- 
tween the learning of psychotic groups and 
normal control groups may be that most re- 
search workers have not adequately controlled 
for differences in levels of motivation and co- 
operation. The findings of the present study 
regarding the differences between the two 
groups on the testability factor would seem 
to lend some support to the hypothesis that 
motivation is one of the key factors in the 
learning deficit found in psychotic groups. It 
may not be that the chronic psychotic pa- 
tients are unable to learn as efficiently as 
normals but, rather, that they are more handicapped 
by the pressure to reduce anxiety or that, 
as Hunt (2) suggested, social approval has 
lost its reward value. 


Effect of Chlorpromazine on Learning Tasks 


There were no significant differences be- 
tween the two groups on the serial verbal 
learning task. This finding suggests that 
chlorpromazine does not significantly affect 
the learning processes involved in verbal 
learning but, rather, that it does have some 
influence on the motivational aspects of such 
learning. It was further noted that the learn- 
ing curves for these samples of chronic psy- 
chotic Ss were similar to those reported with 
other types of Ss. 

The results of the present study would seem 
to indicate that chlorpromazine has little ef- 
fect on the acquisition of simple motor skills 
among those Ss who were able to undertake 
the task. In spite of clinical reports to the 
effect that chlorpromazine induces general 
psychomotor retardation, this retardation did 
not significantly decrease the motor learning 
ability of this group. It may be that there 
was a two-factor effect operating on the mo- 
tor task such that the psychomotor retarda- 
tion produced by chlorpromazine lowered the 
rate of learning while the increased motiva- 
tion and interest enabled the group to do as 
well as the control group. It was also noted 
during the course of the testing that several 
Ss whose behavior and appearance were 
“Parkinsonian” were still able to learn the 
motor task at about the same rate as other Ss. 

On the problem-solving task, the groups 
did not differ significantly even on the test- 
able—not testable split. It will be noted, how- 
ever, that the groups differ at the .10 level 
in the direction that more Ss in the chlor- 
promazine group were testable. Apparently, 
chlorpromazine has little effect on this par- 
ticular problem-solving task. 


Verbalized Social Adaptation 


There were statistically significant differ- 
ences between the two groups on all phases 
of the Verbalized Social Adaptation Question- 
naire. It should be recognized that the only 
validity claimed for this instrument is its face 
validity. 

In formulating the rationale for the ques- 
tionnaire it was felt that the items might have 
some relevance to what might be called 
“readiness for psychotherapy.” The ability to 
make generalizations from the therapeutic 
situation to a particular problem in life would 
seem to be a necessary, but not a sufficient, 
requisite for therapeutic progress. It would 
seem that patients who were unable to cor- 
rectly answer most of the items on this ques- 
tionnaire would be poorer therapeutic risks 
than would those patients who answered most 
of the items correctly. If one accepts this 
rationale, then the conclusion that psychotic 
patients receiving chlorpromazine are more 
accessible to psychotherapy does not seem 
unwarranted. 

A secondary rationale for the relevance of 
the items to “readiness for psychotherapy” is 
that those individuals who are more aware of 
and concerned with their social environment 
would seem to be able to form a more satis- 
factory therapeutic relationship than patients 
who do not have this interest. One of the most 
difficult problems in doing psychotherapy with 
chronic psychotic patients is to arouse their 
motivation and interest in the social environ- 
ment to the extent that it will be possible to 
build a therapeutic relationship. The find- 
ings on this questionnaire would seem to in- 
dicate that chronic psychotic patients receiv- 
ing chlorpromazine are more concerned with 
their social environment. 


Implications for Psychotherapy 


If it is assumed that chlorpromazine exer- 
cises its beneficial effects by reducing over- 
all anxiety, then some speculations regard- 
ing theoretical implications for psychotherapy 
may be made. If the psychotherapeutic inter- 
view is viewed as a learning situation, and if 
the learning tasks used in this study have any 
relation to the type of learning that takes 
place in therapy, these data should offer some 
information which may be integrated into the 
theory of anxiety reduction in psychotherapy. 

Assuming that chlorpromazine reduces anx- 
iety in the chronic psychotic, the finding that 
there were no significant differences between 
the two groups on the verbal, motor, and 
problem-solving tasks seems to be contrary 
to the hypothesis that a reduction in anxiety 
level would result in more effective learning. 
However, the finding that the chlorpromazine 
Ss are more cooperative and seemingly 
more interested in their social environment 
suggests that while anxiety reduction may 
not directly influence the learning processes, 
it may influence other relevant factors which 
make new learning possible. 


Summary 


The effects of chlorpromazine upon the per- 
formance of chronic psychotic Ss on several 
learning and retention tasks were studied. 
Fifty-two Ss were randomly divided into two 
groups, an experimental group which received 
chlorpromazine for a period of two months 
and a control group which received placebos. 
At the end of this period Ss were examined on 
a serial verbal learning task, a motor learn- 
ing task, a problem-solving task, and on a 
questionnaire called the Verbalized Social 
Adaptation Questionnaire. The following con- 
clusions were drawn: 

1. Chlorpromazine appears to have little 
effect on the learning processes per se. 

2. Chlorpromazine has a significant effect 
on improving the responses of chronic psy- 
chotics to items judged to involve social adap- 
tation. 

3. Chlorpromazine significantly increases 
the number of chronic psychotic Ss who are 
motivated to cooperate in the testing pro- 
cedure. 

4. In spite of reports to the effect that 
chlorpromazine produces general psychomotor 
retardation, Ss receiving the drug learn a mo- 
tor task at the same rate as Ss who are not 
receiving the drug. 


Received August 15, 1957 


Stylus Maze Performance of Chronic Schizophrenics 
Taking Chlorpromazine¹ 


Paul G. Daston² 


Veterans Administration Hospital, Brockton, Massachusetts 


Conflicting reports have been published re- 
garding Porteus maze performance of chronic 
schizophrenics treated with chlorpromazine 
(CPZ). The present study employs a stylus 
maze to investigate effects large doses of 
CPZ may have on performance of chronic 
schizophrenics when practice effect is mini- 
mized. 

Ss were 16 chronic schizophrenics, divided 
into CPZ (1200 mg. daily) and placebo (PL) 
groups. Ritalin³ was added for CPZ Ss the 
first 15 days and for PL Ss instead the last 
4 days. All had 27 previous weekly trials, with 
performances approaching an asymptote. 
Groups were equated for performance and 
tested daily for 19 days. Apparatus was a 
Lafayette stylus maze, wired to record er- 
rors. The stylus traced the path taken on a 
paper schematic placed under the maze, thus 
recording extent of blind alley penetration. 
Time was kept by stop watch. Ss were told 
to take the shortest path to the goal without 
touching the sides, to avoid blind alleys and 
to work as fast as possible. 


1 An extended copy of this report may be obtained 
without charge from Paul G. Daston, Clinical 
Psychology Section, Veterans Administration Hos- 
pital, Durham, N. C., or for a fee from the Ameri- 
can Documentation Institute. Order Document No. 
5691, remitting $1.25 for microfilm or $1.25 for 
photocopies. 

2 Now Chief Psychologist at Durham (N. C.) Vet- 
erans Administration Hospital. 

3 Ritalin, a “psychic energizer,” purports to minimize 
such side effects of CPZ as drowsiness and 
mood depression. 


As time and error scores were related (Ss 
taking more time tended to make fewer er- 
rors), each measure was transformed into 
standard scores and both combined. Analyses 
of variance, using difference scores, revealed 
no significant differences between groups, 
either for 1st vs. 15th day, 15th vs. 19th, or 
1st vs. 19th day. In no case did the F ratio 
exceed 1, indicating CPZ, with or without 
Ritalin, had little effect on performance or, 
inferentially, perceptual-motor coordination. 
However, inspection of means revealed CPZ 
group performance remained about the same 
with or without Ritalin, whereas PL mean 
scores improved with its addition. Also, chi- 
square tests of direction of change showed a 
significant shift (p = .05) favoring PL improvement. 
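
The combination of time and error scores described above can be sketched as follows. The data are invented for illustration, the function names are hypothetical, and the sketch assumes that the two standard scores were simply summed, which the report does not specify.

```python
from statistics import mean, pstdev


def standard_scores(values):
    """Convert raw scores to z scores (mean 0, SD 1 within the sample)."""
    m, sd = mean(values), pstdev(values)
    return [(v - m) / sd for v in values]


def composite_scores(times, errors):
    """Sum the time and error z scores for each S (assumed combination rule)."""
    z_times = standard_scores(times)
    z_errors = standard_scores(errors)
    return [zt + ze for zt, ze in zip(z_times, z_errors)]


# Hypothetical data for four Ss: seconds to finish and number of errors.
times = [30.0, 45.0, 60.0, 75.0]
errors = [12, 9, 7, 4]
composite = composite_scores(times, errors)
print([round(c, 2) for c in composite])
```

Because slower Ss tended to make fewer errors, the two z scores partly offset each other, so the composite avoids rewarding an S merely for trading speed against accuracy. Lower composites indicate better performance under this scoring.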

It was felt extent of blind alley penetration 
might give a measure of judgment. A number 
score was assigned to each 16th inch of blind 
alley penetration. Statistically significant dif- 
ferences were found between groups for the 
first measure, so analysis of covariance was 
used. Differences were not significant for 
either 15th or 19th day, indicating judgment 
was not affected differentially by CPZ. 

In summary, perceptual-motor coordination, 
as measured, was not affected significantly by 
large doses of CPZ. PL errors tended to de- 
crease. Judgment, denoted by extent of pene- 
tration into blind alleys, was not differentially 
affected. Ritalin effect warrants further study. 


Brief Report. 
Received May 19, 1958. 


The Effect of Differential Motivating Instructions on 
the Emotional Tone and Outcome of TAT Stories¹ 


Harriet C. Sumerwell, Mary M. Campbell, 
and Irwin G. Sarason² 


University of Washington 


This study was carried out to achieve two 
goals. The first was to determine the extent 
to which we could replicate certain of Eron’s 
(1, 2, 3) findings concerning his scales for 
scoring Emotional Tone and Outcome on the 
TAT. In particular, we were interested in 
whether or not the reliabilities of raters and 
pull values of the TAT cards in our study 
would approximate those obtained by Eron 
in his normative study. 

Our second objective was to assess the sensi- 
tivity of Eron’s Emotional Tone (ET) and 
Outcome (O) scales to experimental instruc- 
tions introduced prior to obtaining subjects’ 
stories to the cards. One of the widely used 
sets of TAT instructions in research and clini- 
cal work is that of Murray (4). Therefore 
Murray’s instructions (adapted for group ad- 
ministration) constituted one of four sets of 
instructions used in this experiment. In ex- 
amining Murray’s instructions, it seemed to 
us that they either directly or indirectly pro- 
vide the subject with information which might 
lead him to draw the inference that the test 
he is about to take is some combination of 
an intelligence and personality test. We at- 
tempted to separate these two possible effects 
by employing another set of instructions in- 
forming the subject he was about to take an 
intelligence test and still another set inform- 
ing the subject he was to take a personality 
test. Our fourth set consisted of neutral instructions 
which merely informed the subject of the procedure he 
was to follow in responding to the cards. 

1 Portions of this paper were presented at the 1957 
meeting of the Western Psychological Association. 

2 Irwin G. Sarason is responsible for the manuscript 
in its present form. 

For the present purposes we assumed that 
the ET and O ratings would reflect the de- 
gree of stress experienced by the subject and 
that the lower the ratings (i.e., greater dys- 
phoria) the greater was the subject’s reaction 
to stress. Our hypothesis concerning the ef- 
fects of the differential instructions was that 
the Murray instructions would be the most 
stressful of the four sets used and that the 
neutral instructions would be the least stress- 
ful. Consequently, we expected relatively low 
ET and O ratings under the Murray instruc- 
tions and relatively high ratings under the 
neutral instructions. The ET and O ratings 
for the Projective and Intelligence instructions 
were expected to fall between those obtained 
under the Murray and Neutral instructions. 


Method 
Subjects 


The Ss were 106 volunteers from introductory 
psychology classes at the University of 
Washington. The total number of Ss in each 
group was as follows: Murray—20, Intelli- 
gence—22, Projective—42, and Neutral—22. 
Half the Ss in each group were tested by one 
E, and half by the other E. There were equal 
numbers of males and females in the Murray, 
Intelligence, and Neutral groups, although we 
had made no systematic attempt to control 
for the sex variable. In the Projective group, 
females outnumbered males 29-13. 


Procedure 


Eight TAT cards (1, 2, 4, 6BM, 13MF, 
16, 18GF, and 18BM) were presented to each 
group of Ss by projection upon a screen after 
the appropriate instructions had been given. 
Each picture was exposed for five minutes, 
during which time the Ss wrote their stories. 
The four sets of instructions were adminis- 
tered as follows: 


1. Murray instructions. This is a test of imagina- 
tion, one form of intelligence. I am going to show 
you some pictures, one at a time; and your task will 
be to make up as dramatic a story as you can for 
each. Tell what has led up to the event shown in 
the picture, describe what is happening at the mo- 
ment, what the characters are feeling and thinking; 
and then give the outcome. Write your thoughts as 
they come to your mind. Do you understand? Here 
is the first picture. 

2. Intelligence test instructions. Being college stu- 
dents you are probably accustomed to taking intelli- 
gence tests. This is a special test of intelligence and 
the ability to think creatively. Rather than measur- 
ing specific abilities, this test gets at innate capacity 
and potentialities. I am going to show you certain 
pictures, one at a time; and your task will be to make 
up a story for each. Tell what has led up to the 
event shown in the picture, describe what is happen- 
ing at the moment, what the characters are feeling 
and thinking; and then give the outcome. Write your 
thoughts as they come to your mind. Do you under- 
stand? Here is the first picture. 

3. Personality test instructions. You have probably 
heard of projective techniques and tests. The test you 
are about to take is the Thematic Apperception Test. 
It is a projective test in the sense that you uncon- 
sciously project your inner feelings and motives into 
the stories you tell. I am going to show you some 
pictures, one at a time; and your task will be to 
make up a story for each. Tell what has led up to 
the event shown in the picture, describe what is hap- 
pening at the moment, what the characters are feel- 
ing and thinking; and then give the outcome. Write 
your thoughts as they come to your mind. Do you 
understand? Here is the first picture. 

4. Neutral instructions. I am going to show you 
some pictures, one at a time; and your task will be 
to make up a story for each. Tell what has led up 
to the event shown in the picture, describe what is 
happening at the moment, what the characters are 
feeling and thinking; and then give the outcome. 
Write your thoughts as they come to your mind. Do 
you understand? Here is the first picture. 


The four sets of instructions were group- 
administered by two different Es. Thus there 
were eight experimental groups in the experi- 
ment, and it became possible to evaluate the 
effects of the experimenter difference. 

After all the data had been gathered, the 
stories for all Ss were independently scored by 
the two Es for ET and O. The scoring procedure 
was so arranged that neither E had 
knowledge of the group to which the S, whose 
protocol was being scored, belonged. Disagree- 
ment between raters was resolved for the pur- 
poses of the statistical analysis of the data by 
the third author (Sarason) who had had con- 
siderable experience in using Eron’s scales. 


Results 


Pearson product-moment correlations be- 
tween the ratings (made prior to resolution 
of disagreement) of the two examiners were 
.817 for ET and .808 for O. These correlations 
are similar to those obtained by Eron 
(1, 2, 3) and seem to justify increased confidence 
in the use of this scoring system for experimental 
purposes.³ In general, all of the 
cards tended to elicit stories negative in both 
ET and O, a finding which again corroborated 
Eron’s and Sarason and Sarason’s earlier re- 
sults. 

The effects of the experimental instructions 
and the two examiners were evaluated by 
means of analysis of variance. The measure 
employed for each S was the sum of the rat- 
ings over all cards administered. Table 1 pre- 
sents the means and SDs of ET and O rat- 
ings as a function of the experimental condi- 
tions. Preliminary comparisons of the eight 
experimental groups indicated that the as- 
sumption of homogeneity of variance could 
be met in our statistical tests. 

In general, no significant difference between 
examiners was found for either the ET or O 
ratings. There was, however, a significant dif- 
ference at the .01 level between examiners for 
O under Neutral instructions. 

Since there were no significant differences 
between examiners for ET, the data were 
pooled into the four instruction groups. A sig- 
nificant difference at the .01 level among these 
groups was found. Significant differences be- 
tween the four pooled groups under O were 
also obtained at the .01 level. However, since 
significant examiner differences under the 
Neutral condition for the O ratings had been 


3 The correlations obtained are also similar to those 
obtained by Sarason, I. G. and Sarason, B. R. in an 
unpublished study, “The effect of type of administra- 
tion and sex of subject on emotional tone and out- 
come ratings of TAT stories.” 







Table 1 

Means and Standard Deviations of Emotional Tone and Outcome Ratings as a 
Function of Instructions and Examiners 

                    Examiner No. 1     Examiner No. 2     Average of 2 Examiners 
Measure and 
instructions        Mean      SD       Mean      SD       Mean      SD 

Emotional Tone 
  Murray            -8.90     2.02     -9.20     2.39     …         … 
  Projective        -8.57     2.36     -8.29     2.69     …         2.50 
  Intelligence      -7.7      2.41     -8.55     2.34     …         2.36 
  Neutral           -6.00     1.84     -6.73     2.28     …         2.06 

Outcome 
  Murray            -4.10     3.70     -4.80     3.85     …         … 
  Projective        -3.24     3.27     -2.19     4.04     …         3.69 
  Intelligence      -3.45     3.83     -1.73     3.74     …         3.80 
  Neutral           +.82      3.25     -1.91     2.02     …         2.99 

Note. The measure for each S was the sum of ratings over all cards administered. 
Entries shown as … are illegible in the source. 


found, a further analysis of variance of O 
ratings was performed omitting the Neutral 
group. When the three remaining groups were 
compared, no significant differences were dis- 
covered among the three remaining groups. 

When t tests were performed, it was found 
that for the ET ratings, the Murray, Projec- 
tive, and Intelligence groups differed signifi- 
cantly from the Neutral group at the .01, 
.01, and .05 levels of confidence, respectively. 
The Murray, Projective, and Intelligence 
groups did not differ significantly among 
themselves. 

With respect to the O rating, the Murray 
and Projective groups were found to differ 
significantly from the Neutral group (.01 and 
.05 levels respectively). The Intelligence group 
did not differ significantly from the Neutral 
group. As already mentioned, the Murray, 
Projective, and Intelligence groups did not 
differ among themselves on the outcome rat- 
ings. 

It seemed of interest to compute the corre- 
lation between ET and O. This r was found 
to be .385, which was low but significant at 
the .01 level. The small size of this correla- 
tion seems to be accounted for by the greater 
variability of O ratings, similarly reported by 
Eron, and the observation that Ss occasion- 
ally tended to give a happy ending quite out 
of line with the initial depressing mood of 
their story. 
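
The significance of a correlation of this size can be checked with the usual t transformation of r. The sketch below is illustrative only: r is taken from the text, while n = 106 is an assumption (the sum of the four group sizes listed under Subjects), and the function name is hypothetical.

```python
import math


def t_for_correlation(r, n):
    """t statistic (n - 2 df) for testing a Pearson r against zero."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# r is from the text; n = 106 is an assumption (sum of the listed group sizes).
t = t_for_correlation(0.385, 106)

# Approximate two-tailed .01 critical value of t for about 100 df.
CRITICAL_01 = 2.63
print(round(t, 2), t > CRITICAL_01)
```

With roughly 100 degrees of freedom, even a modest r clears the .01 criterion, which is consistent with the description of the correlation as low but significant.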


A separate analysis of card differences was 
also performed. It was found that the scores 
for individual cards differed significantly be- 
yond the .001 point for ET under the four 
instructional conditions. For O, there were 
significant card differences at the .001 point 
for Murray, Personality, and Intelligence 
groups, but no significant differences under 
the Neutral group were found. This again 
appeared to be related to examiner differ- 
ences under the Neutral condition. 

In testing for the effect of instructions upon 
cards, the coefficient of concordance was used, 
the cards being ranked on the basis of their 
mean scores obtained under each of the ex- 
perimental conditions. It was found that in- 
structions did not alter the ranks of the cards 
significantly. Thus it appears that a card re- 
tains its original pull value relative to the 
other cards in spite of the over-all effect of 
varied instructions. 
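
The coefficient of concordance used here can be sketched as below. The mean ratings are hypothetical, the function names are illustrative, and the sketch assumes Kendall's W without a correction for ties; the report does not say whether a tie correction was applied.

```python
def rank(values):
    """Rank values from 1 (lowest) upward; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def kendalls_w(rankings):
    """Kendall's W for m rankings of the same n objects (no tie correction)."""
    m, n = len(rankings), len(rankings[0])
    totals = [sum(r[j] for r in rankings) for j in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical mean ratings for 4 cards under 3 instruction conditions.
means_by_condition = [
    [-1.2, -0.4, -2.0, 0.3],
    [-1.5, -0.6, -2.2, 0.1],
    [-0.9, -0.2, -1.1, 0.5],
]
w = kendalls_w([rank(row) for row in means_by_condition])
print(w)
```

A value of W near 1 means the cards keep the same rank order under every set of instructions, which is the pattern the text describes; a value near 0 would mean the instructions reshuffled the cards.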


Discussion 


With respect to the hypotheses we made 
concerning our experimental procedure, it ap- 
pears as though they were supported by our 
results. Happier emotional tone and outcome 
were more associated with the Neutral in- 
structions than with the other instructions 
employed. Although this study does not provide 
sufficient basis for delineating precisely 
the reactions of the Ss to the instructions, it 
seems reasonable that negative ET and O 
scores are associated with emotional upset, 
tension, and dysphoria. In this regard we 
wonder whether instructions such as Murray’s 
do not impose undue stress and structure on 
the testing situation. We generally interpret 
S’s reactions to the TAT in terms of the 
test stimuli presented to him and in terms 
of characteristics peculiar to the individual. 
However, there may be a third variable we 
overlook in test administration, namely, the 
influence of our preliminary instructions on 
S’s behavior. If our aim in testing is to re- 
duce the degree of structure in the test situa- 
tion, perhaps instructions similar to those we 
would call Neutral would be more appropriate. 

Our results have provided further basis for 
employing Eron’s rating scales for research 
purposes. With respect to reliability and other 
characteristics, our results are generally con- 
sistent with those reported by others. It would 
seem useful in future research to study the 
relationship of these and other rating scales 
to a variety of behavioral criteria and experi- 
mental procedures. 

Frequently widely different stories achieved 
similar ratings on Eron’s scale. We feel that 
other rating scales designed to tap a variety 
of personality dimensions and content areas 
might yield pertinent information about Ss’ 
differences to which the ET and O scales are 
relatively insensitive. Further research might 
be directed toward devising other scales and 
later towards investigating the meaning of S 
differences along these dimensions. 


Summary 


The effects of four kinds of instructions and 
of two different examiners on TAT emotional 
tone and outcome ratings were studied. One 
group of Ss received Murray’s instructions; 
one a set of instructions informing Ss they 
were about to take a test of intelligence; an- 
other a set of instructions informing Ss they 
were about to take a personality test; and a 
fourth group of Ss received neutral prelimi- 
nary instructions. 

Adequate interrater reliability for emotional 
tone and outcome was obtained. A difference 
due to examiners was found only for outcome 
ratings under the neutral condition. 

The Murray, personality, and intelligence 
instructions led to more depressive, sadder 
stories than did the neutral instructions. The 
implications of this finding with respect to 
TAT testings were discussed. It was concluded 
that neutral instructions might be the most 
appropriate kind of instructions to give Ss 
prior to taking the TAT. 


When patterns of responses to various 
stimuli are examined, the responses often are 
biased and do not follow a normal probabil- 
ity distribution. Such “set” or einstellung phe- 
nomena tend to appear most clearly as the 
stimulus pattern is unstructured. There are 
myriad forms of such biased responses, such 
as the tendency to call “heads” on the first 
toss of a coin, the tendency to turn right when 
one may turn left or right, etc. (8, 14). In 
psychological tests, Cronbach (12, 13) has 
identified a large number of such biases and 
called them response sets. 

Since there is evidence that such sets or 
biases are expressions of personality traits (1, 
3, 6, 11, 15, 19, 21, 22, 23, 24), Berg (5) 
sought to employ these patterns of bias as 
measures of personality but with only limited 
success. However, when attention was shifted 
from the pattern of bias to those responses 
which departed from the established set or 
bias, it was found possible to use these depar- 
tures as measures of personality and other 
characteristics. Barnes (2), for example, was 
able to construct a series of clinical scales 
using this concept of deviant responses. Thus 
the key to the problem of utilizing response 
bias resides in using, not the pattern of bias 
itself but, rather, the responses which go 
counter to the established bias, i.e., the devi- 
ant responses. In other words, we should not 
pay particular attention to the 80% of the 
population who call “heads” when a coin is 
flipped, but we would closely scrutinize the 
20% who call “tails.” The latter response, in 
our lexicon, is deviant. A fuller account of the 
problem and its background is given in arti- 
cles by Berg (4, 5). 

The present study is one of a series of 


empirical tests of the Deviation Hypothesis 
which has been stated as follows: “Deviant 
response patterns tend to be general; hence 
those deviant behavior patterns which are 
significant for abnormality (atypicalness) and 
thus regarded as symptoms (earmarks or 
signs) are associated with other deviant re- 
sponse patterns which are in noncritical areas 
of behavior and which are not regarded as 
symptoms of personality aberration (nor as 
symptoms, signs, earmarks)” (5, p. 159). In 
other words, by using deviant responses in a 
noncritical area, such as a liking for designs, 
we should be able to measure deviant be- 
havior in a critical area such as psychopa- 
thology, chronic organic disease, employee 
morale, creativeness, etc. 

The specific hypotheses of the present study 
were two: 

1. The response patterns of normal, young 
children to the Perceptual Reaction Test 
(PRT) (7) are significantly different from 
those of normal adults. As a corollary, older 
children have response patterns more similar 
to those of adults than younger children. 

2. The response patterns of normal, young 
children on the PRT more nearly approxi- 
mate those of adult schizophrenics than those 
of normal adults. 

Thus maturity as a critical area of behavior 
is assessed by means of deviant response pat- 
terns as they occur in the PRT. Then, since 
descriptions of schizophrenic behavior (9, 10, 
16, 17, 18, 20) typically make reference to 
immaturity as a common feature of the 
schizophrenic reaction, it is predicted that 
the deviant response patterns of normal, 
young children are similar to those found in 
schizophrenics. 


Table 1 

Distribution of Subjects by Age, Sex, and Place of Residence 

[Table body not recoverable. Column headings: age groups 7-0 to 8-11, 
9-0 to 10-11, 11-0 to 12-11, and Totals. Row headings: Male (Urban, 
Rural), Female (Urban, Rural), and Totals.] 





Procedure 


In order to elicit deviant response pat- 
terns, the PRT was used. This test consists 
of 60 abstract designs drawn with ruler and 
compass. The subject is required to mark 
one of the following options for each design: 
like much (LM), like slightly (LS), dislike 
slightly (DS), or dislike much (DM). No 
obvious meaning is inherent in any of the 
designs; hence, the test is considered to be 
relatively unstructured, a condition which fa- 
cilitates the appearance of response biases. It 
should be noted that other stimulus patterns 
could be employed. However, the PRT has 
been used as a research instrument for a num- 
ber of years and, accordingly, a large body of 
normative data is available. Also, Barnes’ (2) 
data on PRT responses of adult schizophren- 
ics could be used. 


Three hundred Louisiana grade-school children 
were used as Ss for the present study. See Table 1. 
The PRT was administered to the Ss in their regular 
classrooms during a normal school day. Administra- 
tion time was approximately 12 minutes. The Ss’ re- 
sponses were transferred from the PRT booklets to 
IBM answer sheets for machine scoring. The schizo- 
phrenia scales developed by Barnes (called Sigma 
Scales by him) were used to determine each indi- 
vidual’s score. Barnes’ scales were developed by as- 
signing plus weights to those PRT responses which 
were significantly characteristic of his schizophrenic 
group and assigning minus weights to PRT responses 
which were characteristic of his normal group. The 
sum of these weights was the schizophrenic score. 
These scores were compared on the basis of age and 
place of residence by analysis of variance. The
resulting variances for the different groups appeared
to be rather heterogeneous; therefore, Bartlett’s test
for homogeneity of variance was applied to the data. 
Since all of the resulting chi-square values were sta- 
tistically significant beyond the .01 level of prob- 
ability, a nonparametric technique was used to com- 
pare the group schizophrenia scores. Specifically, the 
nonparametric median test seemed to apply to data 
of the type obtained, as it is not dependent upon 
homogeneity of variance. The hypothesis tested is 
that the groups compared are random samples from 
a population with a common median. 
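
The scoring rule described above (plus weights for responses typical of the schizophrenic group, minus weights for responses typical of the normal group, summed over the test) can be sketched as follows. The weight table and the sample responses are hypothetical placeholders chosen for illustration; they are not Barnes' published Sigma Scale weights.

```python
# Hypothetical weight table: (design number, response option) -> weight.
# Plus weights mark responses typical of the schizophrenic group,
# minus weights mark responses typical of the normal group.
HYPOTHETICAL_WEIGHTS = {
    (1, "LM"): +2,
    (1, "DM"): -1,
    (2, "DS"): +1,
    (3, "LS"): -2,
}


def schizophrenia_score(responses, weights=HYPOTHETICAL_WEIGHTS):
    """Sum the weights of all keyed responses; unkeyed responses add zero."""
    return sum(weights.get((design, option), 0)
               for design, option in responses.items())


# One subject's (hypothetical) answers to four designs.
example_responses = {1: "LM", 2: "DS", 3: "LS", 4: "DM"}
score = schizophrenia_score(example_responses)
print(score)
```

A higher total indicates more responses of the kind that characterized the schizophrenic criterion group.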


Results 


Median schizophrenia scores on the PRT
are given for all Ss in Table 2. The results of
the nonparametric median tests for male Ss
are shown in Table 3. The first comparison
was between normal, male children and normal,
adult males. The resulting chi-square
value of 138.14, which is statistically significant
at the .0001 level of probability, supports
the hypothesis that normal male children
attain schizophrenia scores on the PRT
that differ significantly from those of normal,
adult males. The second comparison was between
normal, male children and schizophrenic
adult males. The obtained chi-square value of
3.10 is not statistically significant, supporting
the null hypothesis that the two groups are
probably samples from a population with a
common median. Next, a comparison was
made among three age groups (7-0 to 8-11
years, 9-0 to 10-11 years, 11-0 to 12-11
years) of normal children to determine the
possibility of a significant trend in PRT
schizophrenia scores as a function of chronological
age. The resultant chi-square value of
7.10, which is statistically significant at the
.05 level of probability, supports the hypothesis
that the schizophrenia scores of normal
children decrease significantly with age. The
final comparison was made between groups of
normal, urban and rural, male children. The
obtained chi-square value of 1.71, which is not
statistically significant, supports the null hypothesis
that the two groups appear to be
samples drawn from a population with a
common median.


Table 2

Median Schizophrenia Scores on the PRT for 300
Normal Children, 850 Normal Adults, and 167
Schizophrenic Adults Taken Separately
and as Combined for Nonparametric Median Tests

                                   Median score
                       ------------------------------------------
                                     Combined      Combined with
                       Taken         with normal   schizophrenic
Subjects               separately    adults        adults

Normal children
  Males (N = 150)                    -14.15        +5.80
  Females (N = 150)                   -9.62        +3.36
Normal adults*
  Males (N = 500)
  Females (N = 350)
Schizophrenic adults*
  Males (N = 99)       +1.40
  Females (N = 68)     +3.50

* The data for normal adults and schizophrenic adults are
from Barnes (2).


Table 3

Nonparametric Median Test for PRT Schizophrenia Score
Differences Between Groups of Male Subjects

                               Number
                         -----------------
                         Above      Below              Subgroup    Chi
Subjects                 median     median    Total    median      square

Normal children           140         10       150      +8.50      138.14
Normal adults             191        309       500     -17.83

Normal children            85                           +8.50        3.10
Adult schizophrenics                                    +1.40

Normal children                                                      7.10
  7-0 to 8-11 years
  9-0 to 10-11 years
  11-0 to 12-11 years

Normal children                                                      1.71
  Urban
  Rural


Table 4

Nonparametric Median Test for PRT Schizophrenia Score
Differences Between Groups of Female Subjects

                               Number
                         -----------------
                         Above      Below              Subgroup    Chi
Subjects                 median     median    Total    median      square    df    p

Normal children           121         29       150      +3.25       80.97          < .0001
Normal adults             127                  350     -12.81

Normal children            74                  150      +3.25
Adult schizophrenics                            68

Normal children
  7-0 to 8-11 years
  9-0 to 10-11 years
  11-0 to 12-11 years

Normal children
  Urban
  Rural


Table 4 presents the results of the non- 
parametric median tests performed to com- 
pare the median differences in scores for the 
Barnes schizophrenia scale obtained on the 
PRT between groups of female Ss. The com- 
parisons made between groups of female Ss 
were the same as those for male Ss. Although 
the chi-square values varied somewhat, the 
probability levels obtained were identical with 
those obtained for male Ss. Thus, the conclu- 
sions reached regarding females were the same 
as those for males. 
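
As an arithmetic check on the median test, the first male comparison can be recomputed from the counts reported in Table 3 (140 children above and 10 below the combined median; 191 normal adults above and 309 below). The report does not state which chi-square formula was used; the sketch below assumes the usual 2 x 2 chi-square with Yates' continuity correction, which appears to reproduce the reported value.

```python
def median_test_chi_square(above_a, below_a, above_b, below_b):
    """Yates-corrected chi-square for a 2 x 2 above/below-median table.

    Rows are the two groups; columns are counts above and below
    the median of the combined sample.
    """
    n = above_a + below_a + above_b + below_b
    cross = abs(above_a * below_b - below_a * above_b)
    corrected = max(cross - n / 2, 0)  # continuity correction (assumed)
    denom = ((above_a + below_a) * (above_b + below_b)
             * (above_a + above_b) * (below_a + below_b))
    return n * corrected ** 2 / denom


# Male children vs. normal adult males, counts from Table 3.
chi_sq = median_test_chi_square(140, 10, 191, 309)
print(round(chi_sq, 2))  # 138.14
```

With one degree of freedom, a value of this size lies far beyond the .0001 level, in agreement with the text.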


Discussion 


The results of the present study support 
the hypothesis, as stated earlier, that the re- 
sponse patterns of normal, young children to 
the PRT are significantly different from those 
of normal adults. Their patterns more nearly 
approximate those of adult schizophrenics 
than those of normal adults. As normal chil- 
dren mature their response patterns gradually 
approach the normal adult patterns for the 
PRT. Thus it appears that these differential 
response pattern biases may be used to assess 
maturity, insofar as the evidence of the pres- 
ent study indicates. 

Since normal children make responses to 
the stimulus patterns of the PRT that are 
not significantly different from those of adult 
schizophrenics, it may be inferred that some 
common factor or factors exist between the 
two groups. As noted earlier, immaturity of 
behavior has frequently been used as a phrase 
characterizing schizophrenics. The results of 
this investigation support this description, at 
least insofar as similarity of performance on 
the PRT of normal children and adult schizo- 
phrenics is concerned. Further, since the 
schizophrenia scores of normal children de- 
crease significantly with age, gradually ap- 
proaching the normal adult level, the schizo- 
phrenia scale could be used as a crude meas- 
ure of maturity. In fact, there is every reason 
to believe that with further investigation, us- 
ing wider age ranges and larger age samples, 
a reasonably precise measure of maturity
could be developed. 

As urban and rural children were not sig- 
nificantly different with reference to schizo- 
phrenia scores on the PRT, it would prob- 
ably be unnecessary to develop separate 
norms. It should be noted, however, that the 
actual differences in median scores for the 
two groups, although not statistically signifi- 
cant, were in the same direction for both 
males and females. Urban children scored 
slightly higher than rural children. This dif- 
ference might possibly be found to be sig- 
nificant for larger groups of children. 


Summary 


The present investigation was undertaken 
to determine whether groups differing in 
chronological age could be shown to differ in 
response patterns on the PRT, as predicted 
by Berg’s Deviation Hypothesis. In addition, 
the investigation included a comparison of the 
response patterns for groups of normal chil- 
dren and schizophrenic adults, both being 
characterized by immaturity of behavior, to 
determine whether they might be similar. 
Groups of normal children were compared, 
separately, with groups of normal adults and 
schizophrenic adults. Significant differences 
were found between the response patterns of 
normal children and those of normal adults. 
Normal children and schizophrenic adults did 
not differ significantly. 

A significant trend in response patterns for
normal children as a function of chronological
age was found when three consecutive age
groups were compared. Urban and rural children
were found to have response patterns on
the PRT which were not significantly different.
Thus, it was concluded that response patterns
on the PRT are related to chronological
age, at least in the case of normal children,
and that, since the response patterns of normal
children and schizophrenic adults are not
significantly different, some common factor or
factors such as immaturity must exist between
the two groups. Finally, the deviant
response patterns of normal children were
sufficiently clear-cut that they could be used
to assess maturity.

Signs of Homosexuality in Human-Figure Drawings¹


Armin Grams 


DePaul University²


and Lawrence Rinder 


Illinois State Training School for Boys³


This study investigates the validity of the 
15 signs in human-figure drawing which 
Machover (1) lists as predictive of homo- 
sexuality. Fifty adolescent inmates of a state 
training school were divided on the basis 
of homosexual experience into two groups 
matched for age, schooling, IQ, and race. 
Each subject was asked first to draw a per- 
son and then to draw a person of the sex op- 
posite that of the first drawn figure. Three 
psychologists rated the drawings for the pres- 
ence of signs purported to be indicative of 
homosexuality. To facilitate rating, Mach- 
over’s signs were stated objectively as follows: 


1. Ear large or heavily lined or much detail
2. Detectable delineation of hips or buttocks
3. Failure to complete drawing below waist
4. Heavy line of demarcation at waist
5. Failure to draw “V” of crotch
6. Presence of shading on lips
7. Pants transparent (legs showing through)
8. Naked presence of sexual organs (genitals only)
9. Trousers only clothing shaded
10. Female figure transparent below waist
11. Male nose large, erased, and redrawn
12. Phallic foot (length at least three times width
and/or shaded tip)
13. Belt shaded and speared to right of figure
14. Presence of eye lashes
15. Drawing of female figure first


1 An extended report of this study may be obtained
without charge from Armin Grams, Institute
of Child Development and Welfare, Minneapolis 14,
Minnesota, or for a fee from the American Documentation
Institute. Order Document No. 5691 from
ADI Auxiliary Publications Project, Photoduplication
Service, Library of Congress, Washington 25,
D. C., remitting in advance $1.25 for microfilm or
$1.25 for photocopies.

2 Now at Institute of Child Development and
Welfare, University of Minnesota.

3 Now at Blackhawk Mental Health Center, Waterloo,
Iowa.


The extent to which all judges rated a sign 
present or absent in a drawing became the 
index of rating reliability. Agreement on the 
drawings of homosexuals was 76.5%, on the 
drawings of controls, 83.1%. 

Three of the signs [1, 2, 4] resulted in chi 
squares whose P values were between .10 and 
.20, one [5] resulted in a chi square with a 
P value between .20 and .30, and three [12, 
14, 15] resulted in chi squares whose P values 
were between .50 and .70. The eight remain- 
ing signs resulted in chi squares whose P 
values exceeded .70. Thus, none of the signs 
proved to have individual validity. To deter- 
mine the predictive significance of the 15 
signs taken as a group, the total number of 
signs present in a drawing was correlated with 
homosexual and nonhomosexual experience.
A point-biserial r of .15, not significant at the
.05 level of confidence, was obtained. Thus,
neither individually nor collectively did the 
signs studied validly predict the criterion. 
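
The point-biserial correlation used for the group analysis relates a two-valued criterion (homosexual experience present or absent) to a numeric score (number of signs in a drawing). A minimal sketch follows; the eight subjects and their sign counts are hypothetical and serve only to show the computation, not to reproduce the study's data.

```python
from math import sqrt


def point_biserial(group, scores):
    """Point-biserial r between a 0/1 group code and a numeric score."""
    n = len(scores)
    ones = [s for g, s in zip(group, scores) if g == 1]
    zeros = [s for g, s in zip(group, scores) if g == 0]
    mean_all = sum(scores) / n
    # Standard deviation of all scores, computed with N in the denominator.
    sd_all = sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    p = len(ones) / n  # proportion of subjects coded 1
    q = 1 - p
    mean_diff = sum(ones) / len(ones) - sum(zeros) / len(zeros)
    return mean_diff / sd_all * sqrt(p * q)


# Hypothetical data: 1 = criterion group, 0 = control group.
group = [1, 1, 1, 1, 0, 0, 0, 0]
signs_present = [3, 2, 4, 1, 2, 1, 3, 2]
r = point_biserial(group, signs_present)
print(round(r, 3))
```

The result is numerically identical to an ordinary Pearson r computed with the group membership coded 0 and 1.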
Brief Report. 

Received May 1, 1958. 

The drawing of human figures is a technique
which has proved to be of considerable useful- 
ness in the personality evaluation and diag- 
nosis of psychiatric patients. Interpretations 
of figure drawings, however, are based more 
often on clinical intuition and experience than 
on objectively defined or validated criteria. 
As Holzberg and Wexler (1) point out, in 
their validation study of figure drawings, this 
technique remains “much more of an art than 
a science.” 

Despite efforts to validate figure drawings, 
disagreement concerning the significance of 
certain characteristics of the drawings still 
exists. Clearly contradictory are some of 
the findings relating to paranoid indicators. 
Machover (2) includes, as possible indicators 
of paranoid traits, eye emphasis, ear emphasis, 
large figure placed in the middle of the page, 
a large head, and speared or talon-like fingers. 
In a recently published study, Ribler (3) in- 
vestigated two of these indicators, eye and ear 
emphasis, in paranoid schizophrenics compared
with unclassified schizophrenics, anxiety
neurotics, and “normals.” He found no sta- 
tistically significant differences between the 
diagnostic groups with regard to these two 
variables. 

In endeavoring to interpret these discrepant 
results it is worth noting that Ribler relied 
solely upon diagnostic categories in classifying 
his patients as paranoid or nonparanoid. He 
did not attempt a direct estimation of the 
paranoid component in a patient’s pathology. 
The intent of the current study was to check 
some of the drawing features reported as 
paranoid indicators in a group of patients 
specifically evaluated for the presence and 
extent of paranoid symptomatology. 


Procedure 


Subjects 


In selecting the subjects for this study, a 
number of staff psychiatrists and experienced 
psychiatric residents at the Institute of Living 
were asked to indicate any of their presently 
hospitalized patients who had a history of 
paranoid behavior and were currently para- 
noid. They were also asked for a list of pa- 
tients who had no history of paranoid behavior 
and currently evidenced no paranoid ideation. 
When a patient was classified as paranoid, the 
psychiatrist was requested to rate the degree 
of paranoid involvement as “mild,” “moder- 
ate,” or “marked.” The following definition 
of paranoid trends, adopted in a modified form 
from the Diagnostic and Statistical Manual 
of Mental Disorders prepared by the Ameri- 
can Psychiatric Association, was used as a 
guide by the psychiatrists in categorizing their 
patients: 


Thinking characterized by the use of projection as a 
defense mechanism. Behavioral manifestations are 
suspiciousness, envy, extreme jealousy and stubborn- 
ness. In more flagrant cases, ideas of reference and 
delusions of persecution and/or grandeur are evident.


All patients who were under 15 or over 60, 
of below average intelligence, or having any 
type of organic involvement were excluded 
from the study. The actual sample consisted 
of 61 hospitalized patients, 31 described as 
paranoid and 30 as nonparanoid. The mean 
ages of the paranoid and nonparanoid groups 
were 31.8 and 31.5 years respectively. Both 
groups contained patients having neurotic and 
psychotic diagnoses. In the paranoid group, 12 
patients were judged to be “mildly paranoid,” 
9 “moderately paranoid,” and 10 “markedly
paranoid.” 

Drawings of both a male and female figure 
were already available on a number of these 
patients who had been referred for routine 
psychological testing prior to this study. The 
status of these patients at the time of testing 
was obtained from their physicians. The re- 
maining patients, who had not previously re- 
ceived tests, were individually asked to draw 
human figures. 


Scoring Procedure 


A list of 26 carefully defined characteristics 
of figure drawings, which might discriminate 
paranoid from nonparanoid patients, was com- 
piled from Machover’s indicators, the authors’ 
clinical experience, and relevant items taken 
from the check list used by Holzberg and 
Wexler. These items are as follows: 


1. Careful detailing of eyelashes
2. Careful detailing of eyebrows
3. Line emphasis on outline of the eyes (reinforcement)
4. Shading of the eyes
5. Two eyes in profile view
6. Eyes represented by circles
7. Eyes represented by crosses
8. Eyes represented by dots
9. Eyes represented by dashes or curves
10. Unseeing eyes; eyes without a pupil
11. Unusual detailing and/or articulation of eye
12. Eyes absent
13. Disproportionately large ears
14. Line emphasis on outline of ears (reinforcement)
15. Ears where none should be present
16. Shading of the ears
17. Ear misplaced in relation to other head features
18. Ears absent
19. Unusual articulation and/or detailing of ear
20. Dark or heavy line emphasis overall
21. Contrasting pressures of line; light and heavy
pressures
22. Shading of areas other than eyes and ears
23. Speared or talon-like fingers
24. Disproportionately large head
25. Clothing elaborations to conceal some features
of the figure (cloak, cape, any unusual clothing)
26. Size of figure (actual length measurement)


With all 61 sets of drawings randomized
and any information relating to group classification
removed, the drawings were scored
independently by each of the authors. A plus-minus
scoring system was used to indicate the
presence or absence of the first 25 of the indicators
in a patient’s set of two drawings.
For Item 26, an actual measurement of the
first figure drawn was used.

Marvin Reznikoff and Alma L. Nicholas

There was 93% agreement between the in- 
dependent ratings of the two judges. Differ- 
ences in scoring were reviewed, and agreement 
was reached with regard to the appropriate 
score. 


Treatment of Data 


The number of times each of the first 25 
indicators occurred in the two groups was 
tabulated, and comparisons between the groups 
were made using chi square. A t test was employed
to determine whether the paranoid
patients differed significantly from the non- 
paranoids in the size of their figure drawings. 

A drawing score was obtained for each pa- 
tient representing the number of items out of 
the total of 25 scored present. These scores 
were converted to percentages and transformed 
to equalize the variances. A t test was then
utilized to compare the paranoid and nonparanoid
groups. For the paranoid group, the
transformed scores were distributed into the
“mildly paranoid,” “moderately paranoid,”
and “markedly paranoid” subcategories, and 
an analysis of variance design was applied to 
test for differences between these three groups. 


Results and Discussion 


Of the 25 indicators only 2, line emphasis 
on outline of eyes (Item 3) and heavy line 
emphasis overall (Item 20), occurred sig- 
nificantly more frequently in the paranoid 
group. The chi square value of 6.32 for the
eye emphasis item is significant at the .02 level.
For over-all line emphasis chi square was
4.95, which falls at the .05 level of confidence.
Since 25 tests of significance were performed 
in this analysis, it should be recognized that 
even the two significant differences obtained 
may have been due to chance. According to 
Wilkinson’s tables (4), the actual probability 
of two significant results occurring in a series 
of 25 statistical tests is .36. 
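
The figure of .36 can be checked with a short binomial calculation. The sketch assumes that the 25 tests are independent and that each has a .05 chance of reaching significance when no true difference exists; under those assumptions the probability of two or more significant results agrees with the tabled value.

```python
from math import comb


def prob_at_least(k, n, alpha):
    """P(at least k of n independent tests are significant by chance)."""
    fewer_than_k = sum(comb(n, i) * alpha ** i * (1 - alpha) ** (n - i)
                       for i in range(k))
    return 1 - fewer_than_k


# Two or more "significant" results among 25 tests at the .05 level.
p_two_or_more = prob_at_least(2, 25, 0.05)
print(round(p_two_or_more, 2))  # 0.36
```

In other words, about one series of 25 tests in three would yield two nominally significant results even if none of the indicators discriminated between the groups.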

The mean figure drawing size of the paranoid
group was 6.36 while the nonparanoids
produced drawings which averaged 5.76. The
t test of the difference between means resulted
in a value of .98, which is nonsignificant and
reveals that paranoids do not typically draw
larger figures than other patients.

Figure Drawings and Paranoid Pathology

In a further analysis of the data it was 
found that the paranoids did not differ sub- 
stantially from the nonparanoids in the num- 
ber of indicators occurring in their individual 
drawings. The mean number of indicators 
present in the drawings of the paranoid group 
was 4.97, while the mean for the nonparanoid 
patients’ drawings was 4.23. The t test in this
instance yielded a nonsignificant value of 1.45. 

A final procedure entailed comparing the 
drawings of three subgroups of paranoid pa- 
tients categorized according to severity of 
paranoid symptomatology. The differences be- 
tween the patients in these groups in the num- 
ber of indicators appearing in their drawings 
(drawing scores) were not statistically sig- 
nificant (F = .24). The mean numbers of in- 
dicators in the three groups were 5.00, 4.41, 
and 5.40 for the “mildly,” “moderately,” and 
“markedly” paranoid patients, respectively. 


Summary 


An attempt was made to check the useful- 
ness of 26 drawing indicators in determining 
paranoid pathology. Figure drawings were col- 
lected from 61 hospitalized patients of varied 
diagnostic categories, evaluated specifically 
for paranoid symptomatology. Comparisons 
were then made between the figure drawings 
of 31 patients in this group classified as paranoid
and the remaining 30 patients judged
entirely free of paranoid characteristics. 

Of 25 indicators, emphasis on the outline of 
the eyes and over-all heavy line emphasis oc- 
curred significantly more frequently in the 
paranoid group. It should be noted, however, 
that two significant findings in 25 statistical 
tests could easily be obtained on the basis of 
chance. The size of the figures did not differ- 
entiate between the groups. When the patients 
were compared with regard to their drawing 
scores, that is, the number of indicators pres- 
ent in their individual drawings, the groups 
again did not differ significantly. Nonsignifi- 
cant results were also obtained in a compari- 
son of the drawing scores of the “mildly,” 
“moderately,” and “markedly” paranoid pa- 
tients. 

Causative Factors in the Production of Rotations
on the Bender-Gestalt Designs 


Lewis D. Hannah 
Athens State Hospital 


Many studies of performance on the Bender-Gestalt
Visual Motor Test (1) have been
made recently. There are 50 direct references 
to the test listed in Psychological Abstracts 
1950 through 1956. Most of these report cor- 
relations of performance on the Bender-Gestalt 
(BG) with performance on other tests or some 
manner of behavior. None of them report in- 
vestigation of reasons as to why various dis- 
tortions in reproductions are made. It seems 
that if more of the workings of mental mecha- 
nisms tending to cause distortions were known, 
the test could be used in a more realistic and 
meaningful way. 

The present study is an attempt to show 
that factors other than mental pathology tend 
to produce “abnormalities” in the way de- 
signs are reproduced. It is believed that the 
demonstration of even one way of artificially 
producing an “abnormality” in the reproduc- 
tion of designs will make necessary re-examina- 
tion of many premises that have been often 
taken for granted or even possibly shown by 
statistical methods to correlate with behav- 
ior. Repeated use of the test has suggested 
that often a “rotation” was produced by a 
subject as a function of the way the stimulus 
was presented rather than of something within 
the subject. 

The BG test as commonly used today has 
the designs printed on a card that is oriented 
horizontally. That is, the top and bottom 
edges of the card are longer than the edges on 
either side. These designs are then copied on 
paper with horizontal edges which are much 
shorter than the vertical edges. Because of 
this, a subject must rotate the reproduction 90 
degrees in order to preserve the orientation of 
the design on the card. If this is true, then
comparable populations should produce fewer
rotations if the stimulus card is presented to 
them with the design oriented to the short 
edge of the cards as it now is to the long edge. 
This hypothesis was tested in the following 
manner. 

First, a new set of BG cards was made. 
These cards, hereafter referred to as new BG 
cards in contrast to old BG cards, were made 
of material similar to the old cards. The design 
was made to be the same size as on the old 
cards. The length of the horizontal edges of 
the cards was also the same as the old cards. 
However, the vertical edges were extended to 
a length which made the ratio of the length of 
the horizontal edge to the vertical edge the 
same as the ratio of the 8½” side to the 11”
side of a standard sheet of paper. Sixty con- 
secutive admissions to the Athens State Hos- 
pital were tested in the traditional way using 
the old cards. Then 60 were tested in the same 
way, except that the stimulus cards used were 
the set expressly prepared for use in this ex- 
periment. 

The two groups were then compared for 
diagnosis, age, sex, etc. By comparing diag- 
noses it was possible to select 36 cases from 
one group for which a matching case could be 
found in the other group. These two groups of 
36 each were then compared by Student’s t
for difference in age and not found to be sig- 
nificantly different. There were 15 males and 
20 females in the first group, while there were 
12 males and 24 females in the second group. 

After the records were selected they were 
then scored for rotations. For this study, a 
rotation was scored only if the reproduction 
was clearly recognizable—if a naive subject 
could be expected to identify which stimulus 
card it was copied from—and if it was clearly
rotated at least 45 degrees. Each record was 
then given a score which was equal to the 
number of designs rotated on the record. Thus 
the minimum possible score was zero and the 
maximum possible score was eight. The scores 
of the records produced by those subjects 
presented the old cards were then compared 
with the scores earned by those who had the 
designs presented on the new cards. 


Results 


In Group I, those who had copied the de- 
signs from the new cards, eight rotations were 
produced. The average score for the group was 
.222. The highest score earned was one. 

In Group II, the highest score earned was 
five and the lowest zero. Rotations were found 
on eight of the records, and the average num- 
ber of rotations for the group was .639. 

Comparison of the scores earned was made 
by use of Student’s t and it was found that the
probability of the differences in number of 
rotations produced by the two groups being 
due to chance alone was less than .01. 
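
The reported group averages follow directly from the rotation counts. The sketch below uses the group size of 36 and the Group I total of eight rotations given in the text; the Group II total is not reported, so the value of 23 is inferred here from the stated average of .639 and should be read as an assumption.

```python
GROUP_SIZE = 36  # matched records per group, as stated in the text


def mean_rotation_score(total_rotations, n_records=GROUP_SIZE):
    """Average number of rotated designs per record."""
    return total_rotations / n_records


# Group I: eight rotations in all (reported).
group_1_mean = mean_rotation_score(8)

# Group II: total inferred from the reported mean of .639 (assumption).
implied_group_2_total = round(0.639 * GROUP_SIZE)
group_2_mean = mean_rotation_score(implied_group_2_total)

print(round(group_1_mean, 3), implied_group_2_total, round(group_2_mean, 3))
```

On these figures the group shown the old cards produced nearly three times as many rotations per record as the group shown the new cards.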


Discussion 


It is felt that the BG test is a valuable aid 
in diagnosis but that too often its limitations 
are not kept in mind when making interpreta- 
tions. Though too little is known about per- 
ceptual mechanisms to explain satisfactorily 
all “abnormalities” produced by subjects asked 
to copy the BG designs, this study has demonstrated
one way that “abnormalities” are produced
as a function of the way the stimulus
design is oriented on the card on which it is 
presented. It is believed that the use of the 
“new” cards described in this report will 
reduce the number of false positives called
by the BG. What, in terms of diagnosis or 
behavior correlates, the positives are, still re- 
mains to be established, and interpretations 
must still be made almost entirely on the 
basis of personal experience with the test. 


Summary 


It is believed that factors other than organic 
or functional impairment of mental function- 
ing will produce rotations in reproductions of 
the BG cards. To test this, two groups matched 
for age, sex, and psychological diagnosis were 
given the BG test. The stimulus cards for the 
two groups were different in that designs on 
one were printed with the card oriented hori- 
zontally, while for the other test the cards 
were oriented vertically. The group which was 
presented the vertical cards produced fewer 
rotations than the control group which was 
presented standard BG cards. The results were 
significant at the .01 level of confidence. 

Sex Differences on the Rorschach¹

The data to be summarized were gathered
as part of a larger study concerned with the 
role of stimulus attributes in determining Ror- 
schach responses. It was in relation to this 
larger project that the problem of sex differ- 
ences arose, since it was not possible to ob- 
tain equal numbers of males and females for 
the study. Numerous investigations in the 
literature pose an interpretive problem in this 
regard, although usually the problem tends to 
be ignored. 

Rorschach (9) anticipated that males and 
females as groups would respond differentially 
to inkblots even though it was not possible to 
determine the subject’s (S’s) sex from the 
record of the test. Standard references (2, 3, 
6, 7, 8, 10, 11) devote little or no attention 
to this question, however, despite the fact that 
it is an elementary one in test construction 
theory. The few empirical studies that have 
been completed are limited to children and 
adolescents as Ss (1, 5, 12). If sex differences 
do exist, one might expect these differences to 
show up most clearly in the records of adults. 
The reason for this is the fact that constitu- 
tional differences may be reinforced through 
cultural learning, producing more clear-cut
separation of the sexes as development proceeds.²


Procedure and Results 


Two examiners (one male, one female)
tested a total of 162 Ss (29 males) following
Beck’s (3) procedure. Twenty-six males were
then selected for whom close matches could
be found in the female group as far as the
variables of age and intelligence were concerned.
Matching selections were carried out
by a person unacquainted with the Rorschach
protocols. Selections were also made so that
each examiner tested equivalent numbers of
males and females. All Ss were home-office
employees of two insurance companies in a
southern city.³ The ages of the Ss ranged
from 23 to 50 (mean = 33 years); Otis IQ scores
ranged from 95 to 135 (mean = 111). In this
way two groups of 26 Ss (one male, one female)
were composed which were closely
matched on age, intelligence, and place of
employment. It was not possible, however, to
equate the groups on education, more college
males than college females being included in
the samples.

1 This research is part of a larger project supported
by Grant M-1027 from the National Institute
of Mental Health, U. S. Public Health Service.
We wish to thank Hans Grainer and Patricia M.
Fossum for administering the tests. Frances Alexander
and Paul Fiddleman assisted with the calculations.

2 It is possible that learning effects may reduce sex
differences in a given culture, but in western culture
this is not usually held to be the typical outcome of
training.

The significance of differences between the 
two groups on 17 summary scores (R, W, D, 
Dd, S, M, FM, m, 3M, FM, and m, F + %, 
T/1R, T/R, Total Time, 8-10/R%, Number 
of Content Categories, A%, and P) was then 
evaluated by means of chi square.* Only one 
score, FM, revealed a significant difference 
between the sexes at the .05 level, male Ss 
giving more animal movement responses than 
female Ss. 
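
The comparison described above can be sketched in code. The report does not say how each score was dichotomized for chi square, so the sketch below assumes a split at the pooled median, a common choice for skewed scores. It also assumes a 2 x 2 table with one degree of freedom and no continuity correction. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def median_split_counts(males, females):
    """Count scores above and at-or-below the pooled median for each group.

    Assumption: the split point is the median of both groups combined.
    Returns (males_above, males_not_above, females_above, females_not_above).
    """
    pooled = sorted(list(males) + list(females))
    mid = len(pooled) // 2
    if len(pooled) % 2:
        median = pooled[mid]
    else:
        median = (pooled[mid - 1] + pooled[mid]) / 2.0
    m_above = sum(1 for x in males if x > median)
    f_above = sum(1 for x in females if x > median)
    return m_above, len(males) - m_above, f_above, len(females) - f_above


def chi_square_2x2(a, b, c, d):
    """Pearson chi square for the table [[a, b], [c, d]], uncorrected."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be nonzero")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def p_value_df1(chi2):
    """Upper-tail probability of chi square with one degree of freedom."""
    # With df = 1, chi square is the square of a standard normal deviate.
    return math.erfc(math.sqrt(chi2 / 2.0))


if __name__ == "__main__":
    # Invented FM counts, for illustration only.
    hypothetical_males = [1, 2, 3, 4]
    hypothetical_females = [0, 1, 1, 2]
    table = median_split_counts(hypothetical_males, hypothetical_females)
    chi2 = chi_square_2x2(*table)
    print(table, round(chi2, 3), round(p_value_df1(chi2), 3))
```

With groups as small as 26 each, expected cell frequencies can be low, which is one reason the test is relatively insensitive at this sample size.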

In order to check on the stability of this 
one positive finding, two additional samples 
of 21 males and 21 females were drawn from 
a group of £1 Ss who had responded to an 


3 The authors are indebted to the Jefferson Stand- 
ard Life Insurance Co. and the Pilot Life Insurance 
Co. for their generous contribution of employee time 
and testing facilities. 

*The markedly skewed nature of most distribu- 
tions of Rorschach scores makes chi square a pre- 
ferred technique for analysis. The relative insensi- 
tivity of chi square, unless samples are large, needs 
to be recognized, however. 
achromatic series of Rorschach blots. Al- 
though this series is devoid of color, previous analyses of 
the data had shown that the frequency of FM 
responses was not affected by the presence 
or absence of color. These two groups were 
matched on the productivity (R) variable 
since precise matching on the other variables 
previously used for this purpose was not fea- 
sible. Analysis of the frequency distributions 
of FM responses by chi square failed to con- 
firm the earlier finding (p > .30). 


Discussion 


One other statistical study of Rorschach 
sex differences has been reported recently. 
Felzer (4) checked 26 variables, finding two 
significant differences, T/R and FC. In his 
samples of college students, males took more 
time per response, and they also gave fewer 
FC responses. The females in the present 
study, in contrast, took longer for each re- 
sponse, although the difference did not reach 
a statistically significant level. The present data do 
not provide a good check on the color (also 
shading) variable, since an experimental in- 
quiry procedure was used which renders the 
data noncomparable; hence, tests for these 
scores have not been reported and Felzer’s 
findings regarding the FC variable must be 
checked by later work. The data from both 
studies, however, suggest the improbability 
that stable differences in summary scores be- 
tween adult males and females can be demon- 
strated; an occasional significant difference in 
one study is not likely to be confirmed by a 
subsequent study. 
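
A rough calculation suggests why an isolated significant result is unlikely to replicate when many scores are tested. The sketch below assumes the tests are independent and that no true difference exists on any score. Rorschach summary scores are in fact intercorrelated, so the figures are only approximate. The .05 level applied to Felzer's 26 variables is likewise an assumption, and the function name is hypothetical.

```python
def chance_findings(n_tests, alpha):
    """Expected number of chance 'significant' results among n_tests
    independent tests, and the probability of obtaining at least one,
    assuming no true difference exists on any score.
    """
    expected = n_tests * alpha
    p_at_least_one = 1.0 - (1.0 - alpha) ** n_tests
    return expected, p_at_least_one


if __name__ == "__main__":
    # 17 scores were tested in the present study; Felzer checked 26 variables.
    for n_tests in (17, 26):
        expected, p_any = chance_findings(n_tests, 0.05)
        print(n_tests, round(expected, 2), round(p_any, 2))
```

Under these assumptions, roughly one chance finding per study is to be expected, which is consistent with the single FM difference that failed to hold up in the second sample.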

There is a broader issue than Rorschach 
procedure involved in the present discussion; 
it concerns the relationship between theories 
of personality and the instruments used to 
assess personality. Most dynamic theories of 
personality point out real differences between 
males and females, related to their very dif- 
ferent constitutional endowments, as well as 
the different types of subsequent critical de- 
velopmental experiences which they must 
typically confront. If theory is correct in this 
regard, then one should expect the organiza- 
tional properties and dynamics of personality 
to be distinguishably different in males and 
females. It does not appear psychologically 
sound, therefore, to construct personality 
tests so as to equate the sexes; nor can a 
given technique, such as the Rorschach, be 
defended as sufficiently adequate if sex dif- 
ferences cannot be demonstrated. 


Summary 


Groups of males and females (N = 52) 
were matched for age and intelligence. Ror- 
schach summary scores for the two groups 
were compared in order to determine if sex 
differences existed; no stable differences were 
found. A larger question of the relationship 
between personality theory and sex differences 
as related to assessment devices was also dis- 
cussed. The proposition was defended that 
sensitive personality devices should reflect 
sex differences. 


Received August 23, 1957. 